Bodo Platform SDK

Bodo Platform SDK is a Python library that provides a simple way to interact with the Bodo Platform API. It allows you to create, manage, and monitor resources such as clusters, jobs, and workspaces.

Getting Started: Creating a Bodo SDK client

The first step is to create an API Token in the Bodo Platform for Bodo SDK authentication.

Navigate to API Tokens in the Admin Console to generate a token. Copy and save the token's Client ID and Secret Key and use them to define a client (BodoClient) that can interact with the Bodo Platform.

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)

Alternatively, set the BODO_CLIENT_ID and BODO_SECRET_KEY environment variables so that no keys need to be passed explicitly:

from bodosdk.client import get_bodo_client

client = get_bodo_client()
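
When relying on the environment-variable route above, it can help to check that both variables are set before creating the client. The following is a minimal sketch; the helper name read_bodo_credentials is hypothetical and not part of the SDK, and only the two variable names come from this page.

```python
import os


def read_bodo_credentials(env=os.environ):
    """Return (client_id, secret_key) read from the environment.

    Hypothetical helper, not part of bodosdk: it only checks that the
    two variables named above are set before a client is created.
    """
    missing = [name for name in ("BODO_CLIENT_ID", "BODO_SECRET_KEY")
               if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["BODO_CLIENT_ID"], env["BODO_SECRET_KEY"]
```

Calling the helper first turns a missing variable into an early, explicit error instead of an authentication failure later on.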

To get workspace data, you can access the workspace_data attribute of the client:

from bodosdk.client import get_bodo_client

client = get_bodo_client()
print(client.workspace_data)

Additional Configuration Options for BodoClient

  • print_logs: defaults to False. All API requests and responses are printed to the console if set to True.
from bodosdk.client import get_bodo_client
from bodosdk.models import WorkspaceKeys

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys, print_logs=True)

Jobs

Module responsible for managing jobs in a workspace. Bodo Platform Batch Jobs are a way to run a script on a cluster. The script can be a Python script, a SQL script, or a script from a Git repository or S3 bucket. The script can be run on a cluster with a specific configuration.

Create a batch job definition

BodoClient.job.create_batch_job_definition(job_definition: CreateBatchJobDefinition)

Creates a batch job definition in the given workspace.

  • Example 1: Create batch job definition for a workspace source script
  from bodosdk.models import WorkspaceKeys
  from bodosdk.client import get_bodo_client
  from bodosdk.models.job import CreateBatchJobDefinition, JobConfig, JobSource, JobSourceType, SourceCodeType, \
      WorkspaceDef, RetryStrategy

  keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
  )
  client = get_bodo_client(keys)

  workspace_source_def = JobSource(
      type=JobSourceType.WORKSPACE,
      definition=WorkspaceDef(
          path="Example-path/batch-job-defs",
      ),
  )

  retry_strategy = RetryStrategy(
      num_retries=1,
      retry_on_timeout=False,
      delay_between_retries=2,
  )

  jobConfig = JobConfig(
      source=workspace_source_def,
      source_code_type=SourceCodeType.PYTHON,
      sourceLocation="test.py",
      args=None,
      retry_strategy=retry_strategy,
      timeout=10000,
      env_vars=None,
  )

  createBatchJobDef = CreateBatchJobDefinition(
      name="test-job",
      config=jobConfig,
      description="test-batch-job-description-attempt",
      cluster_config={
          "bodoVersion": "2023.1.3",
          "instance_type": "c5.2xlarge",
          "workers_quantity": 2,
          "accelerated_networking": False,
      }, )

  jobdef = client.job.create_batch_job_definition(createBatchJobDef)
  • Example 2: Create batch job definition for a git source script
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
# GitDef is assumed to live alongside WorkspaceDef in bodosdk.models.job.
from bodosdk.models.job import CreateBatchJobDefinition, GitDef, JobConfig, JobSource, JobSourceType, \
    SourceCodeType, RetryStrategy

keys = WorkspaceKeys(
  client_id='XYZ',
  secret_key='XYZ'
)
client = get_bodo_client(keys)

git_source_def = JobSource(
    type=JobSourceType.GIT,
    definition=GitDef(
        repo_url='https://github.com/Bodo-inc/Bodo-examples.git',
        username='XYZ',
        token='XYZ'
    ),
)

retry_strategy = RetryStrategy(
    num_retries=1,
    retry_on_timeout=False,
    delay_between_retries=2,
)

jobConfig = JobConfig(
    source=git_source_def,
    source_code_type=SourceCodeType.PYTHON,
    sourceLocation="test.py",
    args=None,
    retry_strategy=retry_strategy,
    timeout=10000,
    env_vars=None,
)

createBatchJobDef = CreateBatchJobDefinition(
    name="test-job",
    config=jobConfig,
    description="test-batch-job-description-attempt",
    cluster_config={
        "bodoVersion": "2023.1.3",
        "instance_type": "c5.2xlarge",
        "workers_quantity": 2,
        "accelerated_networking": False,
    }, )

jobdef = client.job.create_batch_job_definition(createBatchJobDef)
  • Example 3: Create batch job definition for an S3 source script

To run a script stored in an S3 bucket, the cluster must have permission to read the files from S3. You can grant this by creating an Instance Role with access to the required bucket and specifying that Instance Role for the Job Cluster. The policy attached to the role must grant access to both the bucket and its contents.

from bodosdk.client import get_bodo_client
from bodosdk.models.job import RetryStrategy
from bodosdk.models import (CreateRoleDefinition, InstanceRole, JobConfig, JobSource, JobSourceType, S3Source, SourceCodeType, WorkspaceKeys, CreateBatchJobDefinition)

keys = WorkspaceKeys(
  client_id='XYZ',
  secret_key='XYZ'
)
client = get_bodo_client(keys)

role_definition = CreateRoleDefinition(
  name="test-sdk-role-creation",
  description="testing",
  data=InstanceRole(role_arn="arn:aws:iam::accountID:role/name")
)

list_of_instance_roles = client.instance_role.list()

role_to_use = None
for role in list_of_instance_roles:
  if role.name == 'role_i_want_to_use':
    role_to_use = role
    break

s3_job_source = JobSource(
  type=JobSourceType.S3,
  definition=S3Source(
    bucket_path='s3://path-to-my-bucket/my_job_script_folder/',
    type=JobSourceType.S3,
    bucket_region='region',
  ),
)

createBatchJobDef = CreateBatchJobDefinition(
  name="test-job",
  config=JobConfig(
    source=s3_job_source,
    source_code_type=SourceCodeType.PYTHON,
    sourceLocation="to_sql.py",
    args=None,
    timeout=10000,
    env_vars=None),
  description="test-batch-job-description-attempt",
  cluster_config={
    "bodo_version": "2023.1.3",
    "instance_type": "c5.2xlarge",
    "workers_quantity": 2,
    "accelerated_networking": False,
    "instance_role_uuid": role_to_use.uuid,
  })

jobdef = client.job.create_batch_job_definition(createBatchJobDef)
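
The requirement that the role's policy cover both the bucket and its contents can be sketched as an IAM policy document with two statements: one on the bucket ARN and one on the objects under it. This is an illustrative sketch only; the function name and bucket name are placeholders, and a real account may need extra statements (for example for KMS-encrypted buckets).

```python
import json


def s3_read_policy(bucket_name):
    """Build a read-only IAM policy document for one S3 bucket.

    Illustrative only: the bucket name is a placeholder and the policy
    may need further statements in a real account.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Access to the bucket itself (listing keys).
            {"Effect": "Allow", "Action": ["s3:ListBucket"], "Resource": [bucket_arn]},
            # Access to the objects stored in the bucket.
            {"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": [bucket_arn + "/*"]},
        ],
    }


print(json.dumps(s3_read_policy("path-to-my-bucket"), indent=2))
```

The bucket-level and object-level statements use different resource ARNs, which is why granting only one of them is usually not enough to read a script.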

List batch job definitions

BodoClient.job.list_batch_job_definitions(page: int, size: int, order: PaginationOrder)

Parameters:

Parameter Type Description Required
page int The page number has a lower limit of 1, default value 1 No
size int The size has a maximum allowed value of 100 and default value of 5 No
order PaginationOrder The order in which the elements would be displayed. Default is created_at ASC No

Returns: A list of all batch job definitions in the given workspace - List[BatchJobDefinitionResponse]

Lists all batch job definitions in the given workspace.

Example:

from typing import List

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.job import BatchJobDefinitionResponse

keys = WorkspaceKeys(
    client_id="XYZ",
    secret_key="XYZ"
)

client = get_bodo_client(keys)
jobdefs: List[BatchJobDefinitionResponse] = client.job.list_batch_job_definitions()
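
Because size is capped at 100, collecting every definition takes several calls. The sketch below shows one way to loop over pages. It assumes each call returns a plain list for the requested page and that a page shorter than size marks the end; collect_all_pages and fetch_page are hypothetical names, not SDK functions.

```python
def collect_all_pages(fetch_page, size=100):
    """Call fetch_page(page, size) from page 1 until a short page is returned.

    fetch_page is a caller-supplied function; with the SDK it could wrap
    client.job.list_batch_job_definitions(page=page, size=size), assuming
    that call returns a plain list for each page.
    """
    if not 1 <= size <= 100:
        raise ValueError("size must be between 1 and 100")
    items, page = [], 1
    while True:
        batch = list(fetch_page(page, size))
        items.extend(batch)
        if len(batch) < size:  # last (possibly empty) page reached
            return items
        page += 1
```

For example, collect_all_pages(lambda page, size: client.job.list_batch_job_definitions(page=page, size=size)) would gather all definitions under the assumptions stated above.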


Get batch job definition by id

BodoClient.job.get_batch_job_definition(job_definition_id: str)

Gets specific batch job definition by id.

Example:

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.job import BatchJobDefinitionResponse

keys = WorkspaceKeys(
    client_id="XYZ",
    secret_key="XYZ"
)

client = get_bodo_client(keys)
jobdef: BatchJobDefinitionResponse = client.job.get_batch_job_definition('04412S5b-300e-42db-84d4-5f22f7506594')


Get batch job definition by name

BodoClient.job.get_batch_job_definition_by_name(name: str)

Returns the batch job definition with the given name.

Example:

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.job import BatchJobDefinitionResponse

keys = WorkspaceKeys(
    client_id="XYZ",
    secret_key="XYZ"
)

client = get_bodo_client(keys)
jobdef: BatchJobDefinitionResponse = client.job.get_batch_job_definition_by_name('batch-job-1')


Remove batch job definition

BodoClient.job.remove_batch_job_definition(job_definition_id: str)

Removes specific batch job definition by id.

Example:

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
    client_id="XYZ",
    secret_key="XYZ"
)

client = get_bodo_client(keys)
client.job.remove_batch_job_definition('04412S5b-300e-42db-84d4-5f22f7506594')


Submit a batch job run

BodoClient.job.submit_batch_job_run(job_run: CreateJobRun)

Submits a job run for a given batch job definition.

Example:

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.job import CreateJobRun

keys = WorkspaceKeys(
    client_id="XYZ",
    secret_key="XYZ"
)

client = get_bodo_client(keys)
client.job.submit_batch_job_run(CreateJobRun(batchJobDefinitionUUID='04412S5b-300e-42db-84d4-5f22f7506594', clusterUUID='12936Q5z-109d-89yi-23c4-3d91u1219823'))

List batch job runs

BodoClient.job.list_batch_job_runs()

Parameters:

Parameter Type Description Required
batch_job_ids List[UUID] List of Ids of the batch job definitions No
statuses List[JobRunStatus] List of Job Run status as filters No
cluster_ids List[UUID] List of cluster ids as filters No
started_at List[UUID] started at time stamp filter No
finished_at List[UUID] finished at timestamp filter No
page int The page number as integer. Default page 1 No
page_size int The number of elements in a page. Default page_size = 10 No
order PaginationOrder The order of listing for the job run ordered by created_at No

Returns: A list of all batch job runs in the given workspace, filtered by the given parameters - List[JobRunResponse]

Example:

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.job import JobRunStatus

keys = WorkspaceKeys(
  client_id="XYZ",
  secret_key="XYZ"
)

client = get_bodo_client(keys)
jobruns = client.job.list_batch_job_runs(statuses=[JobRunStatus.FAILED],
                                         cluster_ids=['ba62e653-312a-490e-9457-71d7bc096959'])


List batch job runs by batch job name

BodoClient.job.list_job_runs_by_batch_job_name()

Parameters:

Parameter Type Description Required
batch_job_names List[str] List of names of the batch job definitions No
statuses List[JobRunStatus] List of Job Run Statuses No
cluster_ids List[str] Cluster IDs filter for the batch job run No
started_at List[str] Started at time filter No
finished_at List[str] Finished at time filter No
page int Page number to fetch. Default value 1 No
page_size int No of elements in a page. Default value 10 No
order PaginationOrder Job run sorting order by created_at. Default value ASC No

Lists all batch job runs in the given workspace filtered by given parameters.

Returns: Returns a list of JobRunResponse - List[JobRunResponse]

Example:

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.job import JobRunStatus

keys = WorkspaceKeys(
    client_id="XYZ",
    secret_key="XYZ"
)

client = get_bodo_client(keys)
jobruns = client.job.list_job_runs_by_batch_job_name(batch_job_names=['production-job-1'], statuses=[JobRunStatus.FAILED], cluster_ids=['ba62e653-312a-490e-9457-71d7bc096959'])


Get batch job run

BodoClient.job.get_batch_job_run(uuid: str)

Returns the batch job run with the given UUID.

Parameters:

Parameter Type Description Required
uuid Union[str, UUID] UUID of the batch job run to fetch Yes

Returns:

Parameter Type Description
name str The name of the job run
clusterUUID Optional[Union[UUID, None]] The cluster UUID on which the job run was executed
cluster Optional[JobCluster] The cluster on which the job run was executed
type JobRunType The type of Job Run
config JobConfig The JobConfig of the cluster
submittedAt datetime The time at which the job was submitted
finishedAt Optional[Union[datetime, None]] The time at which the job finished
startedAt Optional[Union[datetime, None]] The time at which job started
status JobRunStatus Job run status
batchJobDefinitionConfigOverrides Optional[JobConfigOverride] The job run config override
batch_job_definition Optional[BatchJobDefinitionUUID] UUID of batch job definition
reason Optional[str] The reason for the job status change
submitter Optional[str] Email of the person who submitted the job

Gets specific batch job run by job_run_id

Example:

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id="XYZ",
    secret_key="XYZ"
)

client = get_bodo_client(keys)
jobrun = client.job.get_batch_job_run('04412S5b-300e-42db-84d4-5f22f7506594')


Cancel batch job run

BodoClient.job.cancel_batch_job_run(uuid: Union[str, UUID])

Cancels specific batch job run by job_run_id

Parameters:

Parameter Type Description Required
uuid Union[str, UUID] UUID of the job run Yes

Returns:

None returned on success

Example:

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id="XYZ",
    secret_key="XYZ"
)

client = get_bodo_client(keys)
client.job.cancel_batch_job_run('04412S5b-300e-42db-84d4-5f22f7506594')


Cancel all job runs on a set of clusters

BodoClient.job.cancel_all_job_runs(cluster_uuids: Union[List[str], List[UUID]])

Cancels all job runs on the clusters whose UUIDs are passed in.

Parameters:

Parameter Type Description Required
cluster_uuids Union[List[str], List[UUID]] UUIDs of the clusters whose job runs are cancelled Yes

Returns:

None returned on success

Example:

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id="XYZ",
    secret_key="XYZ"
)

client = get_bodo_client(keys)
client.job.cancel_all_job_runs(['04412S5b-300e-42db-84d4-5f22f7506594'])

Check batch job run status

BodoClient.job.check_job_run_status(batch_job_run_uuid: Union[str, UUID])->JobRunStatus

Checks status of specific batch job run by id.

Parameters:

Parameter Type Description Required
batch_job_run_uuid Union[str, UUID] UUID of the batch job run whose status is checked Yes

Returns:

JobRunStatus returned on success

Example:

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id="XYZ",
    secret_key="XYZ"
)

client = get_bodo_client(keys)
status = client.job.check_job_run_status('04412S5b-300e-42db-84d4-5f22f7506594')


Submit SQL job run

BodoClient.job.submit_sql_job_run(create_sql_job_run: CreateSQLJobRun)

Submits a SQL query as a job run. The SQL job run specifies the SQL query text, the UUID of the cluster on which the query will be executed, the catalog, and optional query tags. The query_tags field accepts a JSON-style dictionary of tags to associate with the query.

Note

This needs a database catalog to be configured in the workspace.

Parameters(CreateSQLJobRun):

Parameter Type Description Required Default
Job Run Type JobRunType Type of Job Run Yes None
clusterUUID str The cluster UUID on which the job run will be executed No None
catalog str The catalog that the query uses Yes None
sql_query_text str The SQL query text Yes None
query_tags Dict[str, str] The query tags associated with the query No {}
timeout int The timeout for the query in minutes No 60
env_vars Dict[str, str] The environment variables for the query No {}
retry_strategy RetryStrategy The retry strategy for the query No None
cluster_config JobClusterDefinition Job Cluster Configuration No None
args Union[str, Dict] The arguments for the query No {}

Returns:

Parameter Type Description
name str The name of the job run
clusterUUID Optional[Union[UUID, None]] The cluster UUID on which the job run was executed
cluster Optional[JobCluster] The cluster on which the job run was executed
type JobRunType The type of Job Run
config JobConfig The JobConfig of the cluster
submittedAt datetime The time at which the job was submitted
finishedAt Optional[Union[datetime, None]] The time at which the job finished
startedAt Optional[Union[datetime, None]] The time at which job started
status JobRunStatus Job run status
batchJobDefinitionConfigOverrides Optional[JobConfigOverride] The job run config override
batch_job_definition Optional[BatchJobDefinitionUUID] UUID of batch job definition
reason Optional[str] The reason for the job status change
submitter Optional[str] Email of the person who submitted the job

Example:

from bodosdk.models import WorkspaceKeys, CreateSQLJobRun
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id="XYZ",
    secret_key="XYZ"
)

client = get_bodo_client(keys)

job_run = client.job.submit_sql_job_run(CreateSQLJobRun(
            clusterUUID='04412S5b-300e-42db-84d4-5f22f7506594',
            catalog="SNOWFLAKE_CATALOG",
            sql_query_text="SELECT * FROM PUBLIC.TABLE LIMIT 10",
            query_tags={"DAG_ID":"398482", "MACHINE_ID": "1934"}))


Job Run waiter

BodoClient.job.get_job_run_waiter()

Returns a waiter object that waits until the specified job run finishes.

waiter.wait()

To wait for job run to be finished, invoke the waiter.wait() function, which can take the following parameters.

Parameters:

Parameter Type Description Required
uuid str UUID of the job run to wait for Yes
on_success Callable Callable executed on success with job object passed as argument No
on_failure Callable Callable executed on failure with job object passed as argument No
on_timeout Callable Callable executed on timeout with job_uuid passed as argument No
check_period int Time in seconds between status checks for the wait function. Default is 10 seconds No
timeout int Time in seconds after which timeout error will be raised. Default is none No

Returns:

Parameter Type Description
name str The name of the job run
clusterUUID Optional[Union[UUID, None]] The cluster UUID on which the job run was executed
cluster Optional[JobCluster] The cluster on which the job run was executed
type JobRunType The type of Job Run
config JobConfig The JobConfig of the cluster
submittedAt datetime The time at which the job was submitted
finishedAt Optional[Union[datetime, None]] The time at which the job finished
startedAt Optional[Union[datetime, None]] The time at which job started
status JobRunStatus Job run status
batchJobDefinitionConfigOverrides Optional[JobConfigOverride] The job run config override
batch_job_definition Optional[BatchJobDefinitionUUID] UUID of batch job definition
reason Optional[str] The reason for the job status change
submitter Optional[str] Email of the person who submitted the job
from typing import Callable
def wait(
        self,
        uuid,
        on_success: Callable = None,
        on_failure: Callable = None,
        on_timeout: Callable = None,
        check_period=10,
        timeout=None
):
  pass

By default, wait() returns the job model if no callbacks are provided. Callables can be passed through the on_success, on_failure, and on_timeout parameters:

Example 1. Success/Failure callbacks:

from bodosdk.models import WorkspaceKeys, CreateJobRun
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
input_job = CreateJobRun(clusterUUID='<cluster-uuid>', batchJobDefinitionUUID='<batch-job-definition-uuid>')
job_run = client.job.submit_batch_job_run(input_job)

waiter = client.job.get_job_run_waiter()

def success_callback(job):
    print("in success callback", job.status)

def failure_callback(job):
    print('in failure callback', job.status)

result = waiter.wait(job_run.uuid, on_success=success_callback, on_failure=failure_callback)

Example 2. Timeout callback:

from bodosdk.models import WorkspaceKeys, CreateJobRun
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
input_job = CreateJobRun(clusterUUID='<cluster-uuid>', batchJobDefinitionUUID='<batch-job-definition-uuid>')
job_run = client.job.submit_batch_job_run(input_job)

waiter = client.job.get_job_run_waiter()

def timeout_callback(job_uuid):
    print(f'Waiter timeout for {job_uuid}')
    return job_uuid


result = waiter.wait(job_run.uuid, on_timeout=timeout_callback, timeout=1)
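
To make the roles of check_period, timeout, and the three callbacks concrete, here is a minimal sketch of a polling loop with the same parameters. It is not the SDK's implementation: wait_for_status and get_status are hypothetical, the terminal status names are assumptions, and this sketch passes a status string to the callbacks where the real waiter passes a job object.

```python
import time


def wait_for_status(get_status, uuid, on_success=None, on_failure=None,
                    on_timeout=None, check_period=10, timeout=None,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll get_status(uuid) until it reports a terminal state.

    Sketch only: the terminal state names below are assumptions, and the
    real waiter passes a job object (not a status string) to callbacks.
    """
    deadline = None if timeout is None else clock() + timeout
    while True:
        status = get_status(uuid)
        if status == "SUCCEEDED":
            return on_success(status) if on_success else status
        if status in ("FAILED", "CANCELLED"):
            return on_failure(status) if on_failure else status
        if deadline is not None and clock() >= deadline:
            if on_timeout:
                return on_timeout(uuid)
            raise TimeoutError(f"Job run {uuid} did not finish in {timeout}s")
        sleep(check_period)
```

In this sketch the value returned by a callback becomes the return value of the wait, and a timeout without an on_timeout callback raises an error.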

Get job logs

BodoClient.job.get_job_logs(job_uuid)

Returns the stdout and stderr URLs for the given job run, along with their expiration timestamp. It also downloads the stdout and stderr logs into the workspace.

  • stdout: Standard output of the program execution
  • stderr: Standard error messages of the program execution

Downloads the files listed below, overwriting them if they already exist:

  • stdout_{uuid}.txt
  • stderr_{uuid}.txt

Example:

from bodosdk.models import WorkspaceKeys
from bodosdk.models.job import JobRunLogsResponse
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
logs: JobRunLogsResponse = client.job.get_job_logs('8c32aec5-7181-45cc-9e17-8aff35fd269e')
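
The file naming pattern above can be applied to locate the downloaded logs afterwards. This is a small sketch; downloaded_log_paths is a hypothetical helper, and the directory the files are written to is an assumption.

```python
from pathlib import Path


def downloaded_log_paths(job_uuid, directory="."):
    """Return the expected stdout/stderr file paths for a job run.

    Hypothetical helper: it only applies the naming pattern listed above
    and assumes the files are written to `directory`.
    """
    base = Path(directory)
    return {
        "stdout": base / f"stdout_{job_uuid}.txt",
        "stderr": base / f"stderr_{job_uuid}.txt",
    }
```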

Clusters

Module responsible for managing clusters in workspace.

Availability Zone Selection

When creating a cluster, you can specify the availability zone in which the cluster will be created. However, cluster creation might fail if the availability zone does not have sufficient capacity to create the cluster. Even after the cluster is created, resuming or scaling it might fail if the availability zone does not have sufficient capacity to resume or scale the cluster.

Bodo supports an auto_az flag in cluster creation, which is set to True by default. When enabled, create, scale, and resume tasks attempt to automatically select an availability zone with sufficient capacity for the cluster. To disable this behavior, set auto_az to False in the ClusterDefinition object.

Available instance types

BodoClient.cluster.get_available_instance_types(region:str) -> Dict[str, InstanceCategory]

Parameters:

Parameter Type Description Required
region str Azure / AWS region Yes

Returns: Return dictionary of instance types available for given region grouped by instance category.

InstanceCategory:

Field Type Description
name str Name of the instance category
instance_types Dict[str, InstanceType] Dict with all instances in the category
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
instance_types = client.cluster.get_available_instance_types('us-west-2')

Available images

BodoClient.cluster.get_available_images(region:str) -> Dict[str, BodoImage]

Parameters:

Parameter Type Description Required
region str Azure / AWS region Yes

Returns: Return dictionary of images available for given region where key is bodo version and value is BodoImage:

BodoImage:

Field Type Description
image_id str Image id
bodo_version str Bodo version
from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
images = client.cluster.get_available_images('us-west-2')

Create cluster

BodoClient.cluster.create(cluster_definition: ClusterDefinition) -> ClusterResponse

Creates a cluster in the workspace based on the instance type, the number of workers, and whether the instances are spot instances. The cluster can be configured with auto-pause and auto-stop times, in minutes, to pause or stop the cluster when there is no activity.

Important

If you choose to create a cluster with spot instances, please note:

  • Spot instance clusters are only supported on AWS at this moment.
  • Spot instances cost less at the expense of reliability. We recommend using instance types with a lower reclaim rate according to the AWS Spot Instance Advisor.
  • Spot instance clusters cannot be paused/resumed. Please use stop/restart instead.
  • Auto pause on spot instance clusters is not allowed. Please use auto stop instead.
  • Spot instance clusters will have a 60-minute auto stop by default to avoid accidental long-running clusters
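
The auto-pause and auto-stop rules in the list above can be checked on the client side before a cluster definition is submitted. The sketch below is a hypothetical helper, not part of bodosdk; it mirrors only the rules listed here, and how the platform itself enforces them is not shown.

```python
def check_spot_settings(use_spot_instance, auto_pause=None, auto_stop=None):
    """Apply the spot-instance rules listed above to proposed settings.

    Hypothetical client-side check, not part of bodosdk. Returns the
    (auto_pause, auto_stop) pair that the rules imply.
    """
    if not use_spot_instance:
        return auto_pause, auto_stop
    if auto_pause:
        raise ValueError("auto_pause is not allowed on spot clusters; use auto_stop")
    # Spot clusters get a 60-minute auto stop when none is given.
    effective_auto_stop = 60 if auto_stop is None else auto_stop
    return None, effective_auto_stop
```

The returned values could then be passed as auto_pause and auto_stop when building a ClusterDefinition.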

Parameter: cluster_definition: ClusterDefinition

Field Type Description Required
name str Cluster name Yes
instance_type str Instance type of cluster nodes Yes
workers_quantity int Number of nodes in the cluster Yes
bodo_version str Bodo version available on the cluster Yes
description Optional[str] Cluster description No
image_id Optional[str] Image id used on cluster nodes No
auto_stop Optional[int] Cluster auto stop value [min] No
auto_pause Optional[int] Cluster auto pause value [min] No
availability_zone Optional[str] Availability zone for cluster No
aws_deployment_subnet_id Optional[str] [AWS] Subnet id for cluster No
instance_role_uuid Optional[str] [AWS] Instance role No
auto_az Optional[bool] [AWS] Whether the cluster has Auto AZ enabled No
use_spot_instance Optional[bool] [AWS] Whether the cluster uses spot instance No
is_job_dedicated Optional[bool] whether the cluster is a job dedicated cluster No
accelerated_networking Optional[bool] Whether the cluster uses accelerated networking (e.g. EFA on AWS) No

Returns: Object ClusterResponse:

Field Type Description
name str Cluster name
uuid Union[str, UUID] Cluster UUID
status ClusterStatus Cluster status
description Optional[str] Cluster description
instance_type str Instance type of cluster nodes
workers_quantity int Number of workers in the cluster
auto_stop Optional[int] Cluster auto stop value [min]
auto_pause Optional[int] Cluster auto pause value [min]
bodo_version Optional[str] Bodo version used in cluster VMs
image_id Optional[str] Image id used on cluster VMs
cores_per_worker Optional[int] Number of cores per worker
accelerated_networking Optional[bool] Whether the cluster uses accelerated networking
created_at str Date when cluster was created
last_known_activity Optional[str] Date of last known cluster activity
is_job_dedicated Optional[bool] whether the cluster is a job dedicated cluster
node_metadata Optional[object] Metadata about nodes (IP, instance id)
asg_metadata Optional[object] [AWS] Auto scaling group metadata
aws_deployment_subnet_id Optional[str] [AWS] Subnet id for cluster
auto_az Optional[bool] [AWS] Whether the cluster has Auto AZ enabled

Example: Create a regular cluster

from bodosdk.models import WorkspaceKeys, ClusterDefinition
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
cluster_definition = ClusterDefinition(
    name="test",
    instance_type="c5.large",
    workers_quantity=2,
    use_spot_instance=False,
    auto_pause=100,
    image_id="ami-038d89f8d9470c862",
    bodo_version="2022.4",
    description="my desc here",
    auto_az=False,
)
result_create = client.cluster.create(cluster_definition)

Example: Create a spot instance cluster:

from bodosdk.models import WorkspaceKeys, ClusterDefinition
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
cluster_definition = ClusterDefinition(
    name="test-spot",
    instance_type="c5.large",
    workers_quantity=2,
    use_spot_instance=True,
    auto_stop=100,
    image_id="ami-038d89f8d9470c862",
    bodo_version="2022.4",
    description="my desc here",
    auto_az=False,
)
result_create = client.cluster.create(cluster_definition)

List clusters

BodoClient.cluster.list(page: int, page_size: int) -> ClusterList

Parameters:

Parameter Type Description Required
page int The page number has a lower limit of 1, default value 1 No
page_size int The page size has a maximum allowed value of 100 and default value of 10 No

Returns: ClusterList

Fields Type Description
data List[ClusterResponse] Contains a list of Cluster Response
metadata PageMetadata Contains fields page, size, total_pages, total_items, and order

Object ClusterResponse:

Field Type Description
name str Cluster name
uuid Union[str, UUID] Cluster UUID
status ClusterStatus Cluster status
description Optional[str] Cluster description
instance_type str Instance type of cluster nodes
workers_quantity int Number of workers in the cluster
auto_stop Optional[int] Cluster auto stop value [min]
auto_pause Optional[int] Cluster auto pause value [min]
bodo_version Optional[str] Bodo version used in cluster VMs
image_id Optional[str] Image id used on cluster VMs
cores_per_worker Optional[int] Number of cores per worker
accelerated_networking Optional[bool] Whether the cluster uses accelerated networking
created_at str Date when cluster was created
last_known_activity Optional[str] Date of last known cluster activity
is_job_dedicated Optional[bool] whether the cluster is a job dedicated cluster
node_metadata Optional[object] Metadata about nodes (IP, instance id)
asg_metadata Optional[object] [AWS] Auto scaling group metadata
aws_deployment_subnet_id Optional[str] [AWS] Subnet id for cluster
auto_az Optional[bool] [AWS] Whether the cluster has Auto AZ enabled

Example: Get the first 10 clusters in the workspace using the default parameter values.

from bodosdk.models import WorkspaceKeys, ClusterResponse
from bodosdk.client import get_bodo_client
from typing import List

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
clusters: List[ClusterResponse] = client.cluster.list().data

Example: Get all clusters in chunks of 10 clusters.

from bodosdk.client import get_bodo_client
from bodosdk.models import (
    WorkspaceKeys,
)

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)

client = get_bodo_client(keys)

cluster_list = []

page = 1
size = 10

clusters = client.cluster.list(page=page, page_size=size)

print('Total Clusters', clusters.metadata.total_items)

total_items = clusters.metadata.total_items
cluster_list = clusters.data

while len(cluster_list) < total_items:
    page += 1
    clusters = client.cluster.list(page=page, page_size=size)
    cluster_list += clusters.data

Get cluster

BodoClient.cluster.get(uuid : str) -> ClusterResponse

Parameters:

Parameter Type Description Required
uuid Union[str, UUID] Cluster UUID Yes

Returns: Object ClusterResponse:

Field Type Description
name str Cluster name
uuid Union[str, UUID] Cluster UUID
status ClusterStatus Cluster status
description Optional[str] Cluster description
instance_type str Instance type of cluster nodes
workers_quantity int Number of workers in the cluster
auto_stop Optional[int] Cluster auto stop value [min]
auto_pause Optional[int] Cluster auto pause value [min]
bodo_version Optional[str] Bodo version used in cluster VMs
image_id Optional[str] Image id used on cluster VMs
cores_per_worker Optional[int] Number of cores per worker
accelerated_networking Optional[bool] Whether the cluster uses accelerated networking
created_at str Date when cluster was created
last_known_activity Optional[str] Date of last known cluster activity
is_job_dedicated Optional[bool] whether the cluster is a job dedicated cluster
node_metadata Optional[object] Metada data about nodes (IP, instance id)
asg_metadata Optional[object] [AWS] Auto scaling group metadata
aws_deployment_subnet_id Optional[str] [AWS] Subnet id for cluster
auto_az Optional[bool] [AWS] Whether the cluster has Auto AZ enabled

Example:

from bodosdk.models import WorkspaceKeys, ClusterResponse
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
cluster: ClusterResponse = client.cluster.get('<CLUSTER-UUID>')

Remove cluster

BodoClient.cluster.remove(uuid: Union[str, UUID], force_remove: bool = False, mark_as_terminated: bool = False)

Removes a cluster from the platform.

Parameters:

Parameter Type Description Required
uuid Union[str, UUID] Cluster UUID True
force_remove bool Try to remove cluster even if there is cluster activity False
mark_as_terminated bool Mark cluster as removed without removing resources; may be useful if cluster creation failed and normal removal is failing False

Returns: Returns None if successful. Otherwise, raises exception.

Example:

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from typing import List

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
client.cluster.remove('<CLUSTER-UUID>')

Scale cluster

BodoClient.cluster.scale(scale_cluster: ScaleCluster) -> ClusterResponse

Changes number of nodes in cluster (AWS only)

Parameter: scale_cluster: ScaleCluster

Field Type Description Required
uuid Union[str, UUID] Cluster UUID Yes
workers_quantity int Number of nodes in the cluster Yes

Returns: Object ClusterResponse:

Field Type Description
name str Cluster name
uuid Union[str, UUID] Cluster UUID
status ClusterStatus Cluster status
description Optional[str] Cluster description
instance_type str Instance type of cluster nodes
workers_quantity int Number of workers in the cluster
auto_stop Optional[int] Cluster auto stop value [min]
auto_pause Optional[int] Cluster auto pause value [min]
bodo_version Optional[str] Bodo version used in cluster VMs
image_id Optional[str] Image ID used in cluster VMs
cores_per_worker Optional[int] Number of cores per worker
accelerated_networking Optional[bool] Whether the cluster uses accelerated networking
created_at str Date when cluster was created
last_known_activity Optional[str] Date of last known cluster activity
is_job_dedicated Optional[bool] Whether the cluster is a job-dedicated cluster
node_metadata Optional[object] Metadata about nodes (IP, instance ID)
asg_metadata Optional[object] [AWS] Auto scaling group metadata
aws_deployment_subnet_id Optional[str] [AWS] Subnet id for cluster
auto_az Optional[bool] [AWS] Whether the cluster has Auto AZ enabled

Example:

from bodosdk.models import WorkspaceKeys, ScaleCluster, ClusterResponse
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
NEW_WORKERS_QUANTITY = 3
scale_cluster = ScaleCluster(
    uuid='<CLUSTER-UUID>',
    workers_quantity=NEW_WORKERS_QUANTITY
)
cluster: ClusterResponse = client.cluster.scale(scale_cluster)

Stop cluster

BodoClient.cluster.stop(uuid: Union[str, UUID])

Stops any cluster activity. You will not incur any charges for a stopped cluster, and you can restart it at any time.

Parameters:

Parameter Type Description Required
uuid Union[str, UUID] Cluster UUID Yes

Returns: Returns None if successful. Otherwise, raises exception.

Example:

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)

client = get_bodo_client(keys)
client.cluster.stop('<CLUSTER-UUID>')

Restart cluster

BodoClient.cluster.restart(uuid: Union[str, UUID])

Restarts a cluster. A cluster can be restarted only if it is stopped.

Parameters:

Parameter Type Description Required
uuid Union[str, UUID] Cluster UUID Yes

Returns: Returns None if successful. Otherwise, raises exception.

Example:

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)

client = get_bodo_client(keys)
client.cluster.restart('<CLUSTER-UUID>')

Pause cluster

BodoClient.cluster.pause(uuid: Union[str, UUID])

Pauses a cluster. A cluster can be paused only if it is running.

Parameters:

Parameter Type Description Required
uuid Union[str, UUID] Cluster UUID Yes

Returns: Returns None if successful. Otherwise, raises exception.

Example:

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)

client = get_bodo_client(keys)
client.cluster.pause('<CLUSTER-UUID>')

Resume cluster

BodoClient.cluster.resume(uuid: Union[str, UUID])

Resumes a cluster. A cluster can be resumed only if it is paused.

Parameters:

Parameter Type Description Required
uuid Union[str, UUID] Cluster UUID Yes

Returns: Returns None if successful. Otherwise, raises exception.

Example:

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)

client = get_bodo_client(keys)
client.cluster.resume('<CLUSTER-UUID>')
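
Because stop, restart, pause, and resume each require the cluster to be in a particular state, scripts often need to wait for a state change before issuing the next call. The sketch below polls `client.cluster.get` until the `status` field matches one of the expected values. `wait_for_status` is a hypothetical helper, not an SDK function, and the status values passed in must come from the `ClusterStatus` type of your SDK version; none are assumed here.

```python
import time
from typing import Any, Collection


def wait_for_status(
    client: Any,
    cluster_uuid: str,
    expected: Collection[Any],
    timeout_s: float = 600.0,
    poll_interval_s: float = 10.0,
) -> Any:
    """Poll client.cluster.get() until the cluster status is in `expected`.

    Returns the last response; raises TimeoutError once the deadline passes.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        cluster = client.cluster.get(cluster_uuid)
        if cluster.status in expected:
            return cluster
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Cluster {cluster_uuid} is still {cluster.status!r} "
                f"after {timeout_s}s"
            )
        time.sleep(poll_interval_s)
```

A script could, for instance, call `client.cluster.stop(uuid)`, wait for the stopped status with this helper, and only then call `client.cluster.restart(uuid)`.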

Modify Cluster metadata

BodoClient.cluster.modify(modify_cluster: ModifyCluster) -> ClusterResponse

This function edits the metadata of a given cluster. The editable properties are description, auto-pause time, auto-stop time, Bodo version, instance type, instance role, the flag for automatic availability zone selection, and the number of workers. Changing the number of workers kicks off a scaling event on the cluster, which resumes the cluster if it is paused. The function also supports modifying only a subset of the properties of the ModifyCluster object, as shown in the example below. A cluster can only be modified while it is in the stopped state. Fields that do not need to change are optional and can be omitted from the call.

Note

Disabling the auto_az flag without specifying an availability_zone in the same request might result in the cluster failing. So make sure to provide a fallback zone to avoid failures.
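
One way to guard against the failure described in this note is to validate the arguments before building the `ModifyCluster` object. The check below is a minimal sketch in plain Python; `check_auto_az_change` is a hypothetical helper and works on a dictionary of keyword arguments rather than on the SDK model itself.

```python
from typing import Any, Dict


def check_auto_az_change(modify_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Reject a request that disables auto_az without a fallback zone.

    `modify_kwargs` holds the keyword arguments intended for ModifyCluster.
    The dictionary is returned unchanged when the check passes.
    """
    disables_auto_az = modify_kwargs.get("auto_az") is False
    if disables_auto_az and not modify_kwargs.get("availability_zone"):
        raise ValueError(
            "auto_az=False requires availability_zone in the same request"
        )
    return modify_kwargs
```

The validated dictionary can then be unpacked into the model, for example `ModifyCluster(**check_auto_az_change(kwargs))`.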

Parameter: modify_cluster: ModifyCluster

Field Type Description Required
uuid Union[str, UUID] Cluster UUID Yes
auto_stop Optional[int] Cluster auto stop value [min] No
auto_pause Optional[int] Cluster auto pause value [min] No
description Optional[str] Cluster description No
workers_quantity Optional[int] Number of nodes in the cluster No
instance_role_uuid Optional[str] [AWS] Instance role No
instance_type Optional[str] Instance type used in cluster No
bodo_version Optional[str] Bodo version available on the cluster No
auto_az Optional[bool] [AWS] Whether the cluster is Auto AZ enabled No
availability_zone Optional[str] Availability zone for the cluster No

Returns: Object ClusterResponse:

Field Type Description
name str Cluster name
uuid Union[str, UUID] Cluster UUID
status ClusterStatus Cluster status
description Optional[str] Cluster description
instance_type str Instance type of cluster nodes
workers_quantity int Number of workers in the cluster
auto_stop Optional[int] Cluster auto stop value [min]
auto_pause Optional[int] Cluster auto pause value [min]
bodo_version Optional[str] Bodo version used in cluster VMs
image_id Optional[str] Image ID used in cluster VMs
cores_per_worker Optional[int] Number of cores per worker
accelerated_networking Optional[bool] Whether the cluster uses accelerated networking
created_at str Date when cluster was created
last_known_activity Optional[str] Date of last known cluster activity
is_job_dedicated Optional[bool] Whether the cluster is a job-dedicated cluster
node_metadata Optional[object] Metadata about nodes (IP, instance ID)
asg_metadata Optional[object] [AWS] Auto scaling group metadata
aws_deployment_subnet_id Optional[str] [AWS] Subnet id for cluster
auto_az Optional[bool] [AWS] Whether the cluster has Auto AZ enabled

Example:

from bodosdk.models import (
    WorkspaceKeys,
    ModifyCluster,
    ClusterResponse,
    CreateRoleDefinition,
    CreateRoleResponse,
)
# The InstanceRole import path is an assumption; adjust it to your SDK version.
from bodosdk.models.instance_role import InstanceRole
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
# Create the client before making any API call.
client = get_bodo_client(keys)

# Register a custom instance role to attach to the cluster.
role_definition = CreateRoleDefinition(
  name="test-sdk-role-creation",
  description="testing-instance-role-creation",
  data=InstanceRole(role_arn="arn:aws:iam::427443013497:role/testing_bucket_with_my_script")
)
result_create_role: CreateRoleResponse = client.instance_role.create(role_definition)

# Modify several properties at once.
modify_cluster = ModifyCluster(
    uuid="<cluster-uuid>",
    auto_pause=60,
    auto_stop=0,
    workers_quantity=4,
    description="using the SDK",
    instance_type="c5.large",
    instance_role_uuid=result_create_role.uuid,
    bodo_version="2022.4",
    auto_az=True,
)
# Modify only a subset of the properties.
partial_modify_cluster = ModifyCluster(
    uuid="<cluster-uuid>",
    auto_pause=120,
)
new_cluster: ClusterResponse = client.cluster.modify(modify_cluster)
new_cluster_partial: ClusterResponse = client.cluster.modify(partial_modify_cluster)

Example: Detach a custom instance role. This replaces the custom instance role with the default role that is automatically created for a cluster.

detach_custom_instance_role = ModifyCluster(
    uuid="<cluster-uuid>",
    instance_role_uuid="default",
)
new_cluster_partial: ClusterResponse = client.cluster.modify(detach_custom_instance_role)

List tasks for a cluster

BodoClient.cluster.list_tasks(uuid: Union[str, UUID]) -> List[TaskInfo]

Gets all tasks for a cluster.

Parameters:

Parameter Type Description Required
uuid Union[str, UUID] Cluster UUID Yes

Returns: List[TaskInfo] Object TaskInfo:

Field Type Description
uuid str Cluster task UUID
status TaskStatus Cluster task status
task_type str Task type
logs str Logs from the task

Example:

from typing import List

from bodosdk.models import WorkspaceKeys, TaskInfo
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
tasks: List[TaskInfo] = client.cluster.list_tasks('<CLUSTER-UUID>')
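
When a cluster has many tasks, a per-status summary is often easier to read than the raw list. The helper below is a small sketch that uses only the `status` field of `TaskInfo` listed in the table above; `count_tasks_by_status` is a hypothetical name, and statuses are converted with `str()` so that the exact representation of `TaskStatus` does not matter.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def count_tasks_by_status(tasks: Iterable[Any]) -> Dict[str, int]:
    """Count tasks per status; str() lets enum and string statuses both work."""
    return dict(Counter(str(task.status) for task in tasks))
```

Calling `count_tasks_by_status(client.cluster.list_tasks(uuid))` returns a dictionary that maps each status to the number of tasks in it.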

Workspaces

Module responsible for managing workspaces in an organization

Workspace getting started

To work with workspaces, generate a Personal Token under Admin Console in the Bodo Platform Dashboard. Then instantiate a PersonalKeys object with the generated client_id and secret_id, and pass it in when creating the client object:

from bodosdk.client import get_bodo_organization_client
from bodosdk.models import PersonalKeys
personal_keys = PersonalKeys(
    client_id='<CLIENT-ID>',
    secret_id='<SECRET-ID>',
)
client = get_bodo_organization_client(personal_keys)

Create Workspace

BodoClient.workspace.create(workspace_definition: WorkspaceDefinition) -> WorkspaceCreatedResponse

Creates a workspace with the specifications passed in through a WorkspaceDefinition object under the user's organization

Parameter: workspace_definition: WorkspaceDefinition

Field Type Description Required
name str Workspace name Yes
region str Region where workspace should be placed Yes
cloud_config_uuid Union[UUID, str] Existing AWS or Azure cloud config uuid Yes
storage_endpoint Optional[bool] Enable additional endpoints (Microsoft.Storage or S3 Gateway). There's no additional charge for using service endpoints. No
aws_network_data Optional[AWSNetworkData] Network data specific to AWS No

Returns: Object WorkspaceCreatedResponse:

Field Type Description
name str Workspace name
uuid str Workspace UUID
status str Workspace status
region str Region where workspace should be placed.
organization_uuid str Organization UUID
data Optional[Any] Specific configuration data for workspace
created_by Optional[str] Email address of the workspace creator
cloud_config Optional[Any] Cloud config data

Example: Create new Workspace

from bodosdk.client import get_bodo_organization_client
from bodosdk.models import PersonalKeys
from bodosdk.models import WorkspaceDefinition
personal_keys = PersonalKeys(
    client_id='<CLIENT-ID>',
    secret_id='<SECRET-ID>',
)
client = get_bodo_organization_client(personal_keys)
wd = WorkspaceDefinition(
    name="<WORKSPACE-NAME>",
    cloud_config_uuid="<CONFIG-UUID>",
    region="<WORKSPACE-REGION>"
)
resp = client.workspace.create(wd)

List Workspaces

BodoClient.workspace.list()

Returns: Returns a list of all workspaces defined under this organization as List[WorkspaceListItem]

Field Type Description
name str Workspace name
uuid str Workspace UUID
status WorkspaceStatus Workspace status
provider str Workspace provider (AWS or AZURE)
region str Region where workspace should be placed.
organization_uuid str Organization UUID
data Optional[Any] Specific configuration data for workspace
server_time Optional[str] Current server time
cloud_config Optional[Any] Cloud config data
created_by Optional[str] Email address of the workspace creator

Example:

from bodosdk.client import get_bodo_organization_client
from bodosdk.models import PersonalKeys
personal_keys = PersonalKeys(
    client_id='<CLIENT-ID>',
    secret_id='<SECRET-ID>',
)
client = get_bodo_organization_client(personal_keys)
resp = client.workspace.list()
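
Several other calls take a workspace UUID, so a common follow-up is to look one up by name in the returned list. The sketch below uses only the `name` and `uuid` fields from the table above. `find_workspace_uuids` is a hypothetical helper; it returns a list because this document does not state whether workspace names are unique.

```python
from typing import Any, Iterable, List


def find_workspace_uuids(workspaces: Iterable[Any], name: str) -> List[str]:
    """Return the UUIDs of all workspaces whose name matches exactly."""
    return [str(ws.uuid) for ws in workspaces if ws.name == name]
```

For example, `find_workspace_uuids(client.workspace.list(), "my-workspace")` yields the matching UUIDs, or an empty list if there is no match.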

Get Workspace

BodoClient.workspace.get(uuid: Union[str, UUID]) -> WorkspaceResponse

Returns information about the workspace with the given UUID as a WorkspaceResponse object.

Parameters:

Parameter Type Description Required
uuid Union[str, UUID] Workspace UUID Yes

Returns: The details of the workspace with the given UUID as a WorkspaceResponse object:

Field Type Description
name str Workspace name
uuid str Workspace UUID
status WorkspaceStatus Workspace status
region str Region where workspace is placed.
organization_uuid str Organization UUID
data Optional[Any] Specific configuration data for workspace
cloud_config Optional[Any] Cloud config data
created_by Optional[str] Email address of the workspace creator

Example:

from bodosdk.client import get_bodo_organization_client
from bodosdk.models import PersonalKeys
personal_keys = PersonalKeys(
    client_id='<CLIENT-ID>',
    secret_id='<SECRET-ID>',
)
client = get_bodo_organization_client(personal_keys)
resp = client.workspace.get("<WORKSPACE-UUID>")

Remove Workspace

BodoClient.workspace.remove(uuid: Union[str, UUID], mark_as_terminated: bool = False)

Removes the workspace with the provided UUID. The operation succeeds only if all resources within the workspace (jobs, clusters, notebooks) are already terminated.

Parameters:

Field Type Description Required
uuid Union[str, UUID] Workspace UUID Yes
mark_as_terminated bool Mark workspace as terminated without removing resources; may be useful if workspace creation failed and deletion is failing No

Returns: Returns None if successful. Otherwise, raises exception.

Example:

from bodosdk.client import get_bodo_organization_client
from bodosdk.models import PersonalKeys
personal_keys = PersonalKeys(
    client_id='<CLIENT-ID>',
    secret_id='<SECRET-ID>',
)
client = get_bodo_organization_client(personal_keys)
resp = client.workspace.remove("<WORKSPACE-UUID>")

Assign user

BodoClient.workspace.assign_users(workspace_uuid: Union[str, UUID], users: List[UserAssignment])

Assigns one or more users to a workspace.

Parameters:

Parameter Type Description Required
workspace_uuid Union[str, UUID] Workspace UUID Yes
users List[UserAssignment] List of users that will be assigned to a given workspace Yes

Returns: Returns None if successful. Otherwise, raises exception.

Example:

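from typing import List

from bodosdk.client import get_bodo_organization_client
# UserAssignment and BodoRole are assumed to be exported from bodosdk.models.
from bodosdk.models import PersonalKeys, UserAssignment, BodoRole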
personal_keys = PersonalKeys(
    client_id='<CLIENT-ID>',
    secret_id='<SECRET-ID>',
)
client = get_bodo_organization_client(personal_keys)
workspace_uuid = "<some uuid>"
users: List[UserAssignment] = [
    UserAssignment(
        email="example@example.com",
        skip_email=True,
        bodo_role=BodoRole.ADMIN
    )
]
client.workspace.assign_users(workspace_uuid, users)

List Workspace tasks

BodoClient.workspace.get_tasks(workspace_uuid: Union[str, UUID]) -> List[TaskInfo]

Returns: A list of the tasks in the workspace as List[TaskInfo]

Field Type Description
uuid str Workspace task uuid
status TaskStatus Status of workspace task
task_type WorkspaceStatus Type of workspace task
logs str Logs from specific task

Example:

from bodosdk.client import get_bodo_organization_client
from bodosdk.models import PersonalKeys
personal_keys = PersonalKeys(
    client_id='<CLIENT-ID>',
    secret_id='<SECRET-ID>',
)
client = get_bodo_organization_client(personal_keys)
resp = client.workspace.get_tasks("<WORKSPACE-UUID>")

Cloud Config

Module responsible for creating cloud configurations for organization.

Create config

BodoClient.cloud_config.create(config: Union[CreateAwsCloudConfig, CreateAzureCloudConfig])

Creates a cloud configuration for a cloud provider (AWS or Azure).

AWS example

from bodosdk.models import OrganizationKeys, CreateAwsProviderData, CreateAwsCloudConfig, AwsCloudConfig
from bodosdk.client import get_bodo_client

keys = OrganizationKeys(
    client_id='XYZ',
    secret_key='XYZ'
)

client = get_bodo_client(keys)

config = CreateAwsCloudConfig(
    name='test',
    aws_provider_data=CreateAwsProviderData(
        tf_backend_region='us-west-1',
        access_key_id='xyz',
        secret_access_key='xyz'
    )

)
config: AwsCloudConfig = client.cloud_config.create(config)

Azure example

from bodosdk.models import OrganizationKeys, CreateAzureProviderData, CreateAzureCloudConfig, AzureCloudConfig
from bodosdk.client import get_bodo_client

keys = OrganizationKeys(
    client_id='XYZ',
    secret_key='XYZ'
)

client = get_bodo_client(keys)

config = CreateAzureCloudConfig(
    name='test',
    azure_provider_data=CreateAzureProviderData(
        tf_backend_region='eastus',
        tenant_id='xyz',
        subscription_id='xyz',
        resource_group='MyResourceGroup'
    )

)
config: AzureCloudConfig = client.cloud_config.create(config)

List configs

BodoClient.cloud_config.list()

Get list of cloud configs.

from bodosdk.models import OrganizationKeys, AzureCloudConfig, AwsCloudConfig
from bodosdk.client import get_bodo_client
from typing import Union, List

keys = OrganizationKeys(
    client_id='XYZ',
    secret_key='XYZ'
)

client = get_bodo_client(keys)

configs: List[Union[AwsCloudConfig, AzureCloudConfig]] = client.cloud_config.list()

Get config

BodoClient.cloud_config.get(uuid: Union[str, UUID])

Get cloud config by uuid.

from bodosdk.models import OrganizationKeys, AzureCloudConfig, AwsCloudConfig
from bodosdk.client import get_bodo_client
from typing import Union

keys = OrganizationKeys(
    client_id='XYZ',
    secret_key='XYZ'
)

client = get_bodo_client(keys)

config: Union[AwsCloudConfig, AzureCloudConfig] = client.cloud_config.get('8c32aec5-7181-45cc-9e17-8aff35fd269e')

Instance Role Manager

Module responsible for managing AWS roles in workspace.

Create role

BodoClient.instance_role.create()

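Creates an AWS role from the given role definition, which references an existing AWS role ARN.

from bodosdk.models import WorkspaceKeys, CreateRoleDefinition, CreateRoleResponse
# The InstanceRole import path is an assumption; adjust it to your SDK version.
from bodosdk.models.instance_role import InstanceRole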
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
role_definition = CreateRoleDefinition(
    name="test-sdk-role-creation",
    description="testing",
    data=InstanceRole(role_arn="arn:aws:iam::1234567890:role/testing")
)
result_create: CreateRoleResponse = client.instance_role.create(role_definition)

List roles

BodoClient.instance_role.list()

Returns list of all roles in workspace

from bodosdk.models import WorkspaceKeys, InstanceRoleItem
from bodosdk.client import get_bodo_client
from typing import List

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
result_list: List[InstanceRoleItem] = client.instance_role.list()

Get role

BodoClient.instance_role.get(cluster_uuid)

Returns role by uuid

from bodosdk.models import WorkspaceKeys, InstanceRoleItem
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
role: InstanceRoleItem = client.instance_role.get('<ROLE-UUID>')

Remove role

BodoClient.instance_role.remove(cluster_uuid, mark_as_terminated=False)

Parameters Type Description Required
mark_as_terminated Boolean Mark role as terminated without removing resources; may be useful if role creation failed and deletion is failing No

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client

keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
client.instance_role.remove('<ROLE-UUID>')

Catalog

Module responsible for storing database catalogs

Create Catalog

BodoClient.catalog.create()

Stores the Database Catalog

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.catalog import CatalogDefinition, SnowflakeConnectionDefinition
keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)

# Type Support for Snowflake
snowflake_definition = SnowflakeConnectionDefinition(
    host="test.snowflake.com",
    port=443,
    username="test-username",
    password="password",
    database="test-db",
    warehouse="test-wh",
    role="test-role"
)

# Other databases need to be defined as JSON
connection_data = {
    "host": "test.db.com",
    "username": "test-username",
    "password": "*****",
    "database": "test-db",
}

catalog_definition = CatalogDefinition(
    name="catalog-1",
    description="catalog description",
    catalogType="SNOWFLAKE", # Snowflake is currently supported
    data=snowflake_definition
)

client.catalog.create(catalog_definition)
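
The example above hard-codes the Snowflake password. A common alternative is to read credentials from environment variables so that they stay out of source control. The sketch below only builds the keyword arguments. The variable names `SNOWFLAKE_USERNAME` and `SNOWFLAKE_PASSWORD` are illustrative assumptions, not names the SDK looks for, and `snowflake_credentials` is a hypothetical helper.

```python
import os
from typing import Dict, Mapping, Optional


def snowflake_credentials(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read Snowflake credentials from the environment.

    The variable names are hypothetical; pick whatever fits your deployment.
    """
    env = os.environ if env is None else env
    required = ("SNOWFLAKE_USERNAME", "SNOWFLAKE_PASSWORD")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "username": env["SNOWFLAKE_USERNAME"],
        "password": env["SNOWFLAKE_PASSWORD"],
    }
```

The result can be unpacked into the connection definition alongside the remaining fields, for example `SnowflakeConnectionDefinition(host=..., port=443, database=..., warehouse=..., role=..., **snowflake_credentials())`.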

Get Catalog by UUID

BodoClient.catalog.get_catalog()

Retrieves the Catalog details by UUID

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.catalog import CatalogInfo
keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
catalog_info: CatalogInfo = client.catalog.get("<CATALOG-UUID>")

Get Catalog by Name

BodoClient.catalog.get_by_name()

Retrieves the Catalog details by name

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.catalog import CatalogInfo
keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
catalog_info: CatalogInfo = client.catalog.get_by_name("test-catalog")

List Catalogs

BodoClient.catalog.list()

Retrieves all catalogs in a workspace.

from typing import List

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.catalog import CatalogInfo
keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
catalogs: List[CatalogInfo] = client.catalog.list()

Update Catalog

BodoClient.catalog.update()

Updates the Database Catalog

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.catalog import CatalogDefinition, SnowflakeConnectionDefinition
keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)

# Type Support for Snowflake
snowflake_definition = SnowflakeConnectionDefinition(
    host="update.snowflake.com",
    port=443,
    username="test-username",
    password="password",
    database="test-db",
    warehouse="test-wh",
    role="test-role"
)

new_catalog_def = CatalogDefinition(
    name="catalog-1",
    description="catalog description",
    catalogType="SNOWFLAKE", # Snowflake is currently supported
    data=snowflake_definition
)
client.catalog.update("<CATALOG-UUID>", new_catalog_def)

Remove Catalog by UUID

BodoClient.catalog.remove()

Deletes a Database Catalog by UUID

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
client.catalog.remove("<CATALOG-UUID>")

Remove all Catalogs

BodoClient.catalog.remove_all()

Deletes all Database Catalogs in the workspace

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
client.catalog.remove_all()

Secret Groups

Module responsible for separating secrets into multiple groups.

A default secret group will be created at the time of workspace creation. Users can define custom secret groups using the following functions.

Create Secret Group

BodoClient.secret_group.create()

Create a secret group

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.secret_group import SecretGroupDefinition
keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)

secret_group_definition = SecretGroupDefinition(
    name="sg-1", # Name should be unique to that workspace
    description="secret group description",
)

client.secret_group.create(secret_group_definition)

List Secret Groups

BodoClient.secret_group.list()

List all the secret groups in a workspace.

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.secret_group import SecretGroupInfo
from typing import List
keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
groups_list: List[SecretGroupInfo] = client.secret_group.list()

Update Secret Group

BodoClient.secret_group.update()

Updates the secret group description

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.secret_group import SecretGroupInfo, SecretGroupDefinition
keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)

update_secret_group_def = SecretGroupDefinition(
    name="sg-1", # Cannot modify the name in the group
    description="secret group description",
)
groups_data: SecretGroupInfo = client.secret_group.update(update_secret_group_def)

Delete Secret Group

BodoClient.secret_group.remove()

Removes the secret group.

Note

You can only remove a secret group if all the secrets in the group are deleted.

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)

client.secret_group.remove("<secret-group-uuid>")

Secrets

Module responsible for creating secrets.

Create Secret

BodoClient.secrets.create()

Create the secret in a secret group.

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.secrets import SecretDefinition
keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)

secret_definition = SecretDefinition(
    name="secret-1",
    data={
        "key": "value"
    },
    secret_group="<secret-group-name>" # If not defined, the default secret group is used
)

client.secrets.create(secret_definition)

Get Secrets by UUID

BodoClient.secrets.get()

Retrieves a secret by UUID

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.secrets import SecretInfo
keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
secret_info: SecretInfo = client.secrets.get("<secret-uuid>")

List Secrets by Workspace

BodoClient.secrets.list()

List the secrets in a workspace

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.secrets import SecretInfo
from typing import List
keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
secrets_info: List[SecretInfo] = client.secrets.list()

List Secrets by Secret Group

BodoClient.secrets.list_by_group()

List the Secrets by Secret Group

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.secrets import SecretInfo
from typing import List
keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
secrets_info: List[SecretInfo] = client.secrets.list_by_group("<secret-group-name>")

Update Secret

BodoClient.secrets.update()

Updates the secret.

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.secrets import SecretDefinition
keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)

update_secret_def = SecretDefinition(
    data={
        "key": "value"
    }
)

client.secrets.update("<secret-uuid>", update_secret_def)

Delete Secrets by UUID

BodoClient.secrets.remove()

Delete the Secret by UUID

from bodosdk.models import WorkspaceKeys
from bodosdk.client import get_bodo_client
from bodosdk.models.secrets import SecretInfo
keys = WorkspaceKeys(
    client_id='XYZ',
    secret_key='XYZ'
)
client = get_bodo_client(keys)
secret_info: SecretInfo = client.secrets.remove("<secret-uuid>")

Billing module

Billing module provides access to billing information related to a particular workspace.

Get job run billing report CSV

BodoClient.billing.get_job_run_price_export(started_at: str, finished_at: str, workspace_uuid: Union[str, UUID])

Provides a CSV download link for the job-run-level billing report, which shows EC2 costs per job run within the given started_at and finished_at range across all workspaces.

To get the billing report for a particular workspace, pass its workspace_uuid, which can be obtained from BodoClient.workspace.list().

The billing report includes essential fields such as start time, end time, duration, worker count, instance type, and associated costs. This link remains active for a duration of 7 days and is exclusively available to AWS customers.

Important

Reports can only be generated for a 30-day time period.

Parameters:

Parameters Type Description Required
started_at Union[str, date] Start date of the report Yes
finished_at Union[str, date] End date of the report Yes
workspace_uuid Union[str, UUID] Workspace UUID, returns all workspaces in organization when null No

Returns:

Fields Type Description
url string S3 Pre-signed URL which contains the report


Example: CSV report for a given date range:

The following Python code generates a CSV report for a date range from September 5th to September 6th for a given workspace UUID. You can also provide the same date range with a timestamp in ISO 8601 format, such as 2023-09-05T11:15:00Z. Running the code prints a link to the CSV report, which can be clicked to start the download:

from bodosdk.models import OrganizationKeys
from bodosdk.client import get_bodo_organization_client

keys = OrganizationKeys(
    client_id="XYZ",
    secret_key="XYZ"
)

client = get_bodo_organization_client(keys)

print(client.billing.get_job_run_price_export('2023-09-05', '2023-09-06', 'WORKSPACE-UUID'))
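
Since a report can cover at most 30 days, a longer period has to be requested in pieces. The helper below is a minimal sketch that splits a date range into consecutive windows of at most 30 days. `split_date_range` is a hypothetical name, and the sketch assumes the limit applies to the span between the start and end dates. Adjacent windows share a boundary date, and this document does not say whether the end date is inclusive, so check for duplicated rows at window boundaries.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple


def split_date_range(
    start: date, end: date, max_days: int = 30
) -> Iterator[Tuple[date, date]]:
    """Yield consecutive (window_start, window_end) pairs of at most max_days."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    if start == end:
        # A single-day range still produces one window.
        yield start, end
        return
    window_start = start
    while window_start < end:
        window_end = min(window_start + timedelta(days=max_days), end)
        yield window_start, window_end
        window_start = window_end
```

Each window can then be passed to the export call as ISO date strings, for example `client.billing.get_job_run_price_export(s.isoformat(), e.isoformat(), workspace_uuid)` for every `(s, e)` pair.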

Get cluster level billing report CSV

BodoClient.billing.get_cluster_price_export(started_at: str, finished_at: str, workspace_uuid: Union[str, UUID])

Provides a CSV download link for the cluster-level billing report, which shows EC2 costs per cluster run within the given started_at and finished_at range across all workspaces. To get the report for a particular workspace, pass its workspace_uuid, which can be obtained from BodoClient.workspace.list().

The billing report includes essential fields such as start time, end time, duration, worker count, instance type, and associated costs. This link remains active for a duration of 7 days and is exclusively available to AWS customers.

Important

Reports can only be generated for a 30-day time period.

Parameters:

Parameters Type Description Required
started_at Union[str, date] Start date of the report Yes
finished_at Union[str, date] End date of the report Yes
workspace_uuid Union[str, UUID] Workspace UUID, returns all workspaces in organization when null No

Returns:

Fields Type Description
url string S3 Pre-signed URL which contains the report


Example: CSV report for a given date range:

The following Python code generates a CSV report for a date range from September 5th to September 6th for a given workspace UUID. You can also provide the same date range with a timestamp in ISO 8601 format, such as 2023-09-05T11:15:00Z. Running the code prints a link to the CSV report, which can be clicked to start the download:

from bodosdk.models import OrganizationKeys
from bodosdk.client import get_bodo_organization_client

keys = OrganizationKeys(
    client_id="XYZ",
    secret_key="XYZ"
)

client = get_bodo_organization_client(keys)

print(client.billing.get_cluster_price_export('2023-09-05', '2023-09-06', 'WORKSPACE-UUID'))
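
Instead of clicking the link, a script can download and parse the report directly. The following is a sketch using only the Python standard library. `download_report` and `parse_report` are hypothetical helpers, and the code assumes the report is UTF-8 encoded CSV with a header row. How the URL is exposed on the value returned by the SDK is not shown here, so the URL string is taken as a plain argument.

```python
import csv
import io
import urllib.request
from typing import Dict, List


def download_report(url: str, timeout_s: float = 30.0) -> str:
    """Fetch the report behind the pre-signed URL and return it as text."""
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        return response.read().decode("utf-8")


def parse_report(csv_text: str) -> List[Dict[str, str]]:
    """Parse CSV text into one dictionary per row, keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))
```

With a URL string in hand, `rows = parse_report(download_report(url))` gives one dictionary per report line. All values are strings, so numeric columns need to be converted before aggregation.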