BodoSQLContext API
The BodoSQLContext API is the primary interface for executing SQL queries. It performs two roles:
- Registering data and connection information to load tables of interest.
- Forwarding SQL queries to the BodoSQL engine for compilation and execution. This is done via the
bc.sql(query) method, where bc is a BodoSQLContext object.
A BodoSQLContext can be defined in regular Python and passed as an argument to JIT functions or can be
defined directly inside JIT functions. We recommend defining and modifying a BodoSQLContext in regular
Python whenever possible.
For example:
import bodo
import bodosql

# username, password, account_name, warehouse_name, and database_name are
# placeholders for your Snowflake connection details.
bc = bodosql.BodoSQLContext(
    {
        "T1": bodosql.TablePath("my_file_path.pq", "parquet"),
    },
    catalog=bodosql.SnowflakeCatalog(
        username,
        password,
        account_name,
        warehouse_name,
        database_name,
    )
)

@bodo.jit
def f(bc):
    return bc.sql("select t1.A, t2.B from t1, catalogSchema.t2 where t1.C > 5 and t1.D = catalogSchema.t2.D")
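The example above passes a context built in regular Python into the JIT function. The other option mentioned earlier, creating the context inside the JIT function, might look like the sketch below. The helper name make_inline_query, the table name T1, and the columns A and C are illustrative assumptions; the Bodo imports are deferred into the helper so the module can be loaded where Bodo is not installed.

```python
def make_inline_query():
    """Return a JIT function that builds its own BodoSQLContext.

    The imports are deferred so this module loads without Bodo installed.
    """
    import bodo
    import bodosql

    @bodo.jit
    def f(df):
        # The context is created from the DataFrame argument inside the JIT function.
        bc = bodosql.BodoSQLContext({"T1": df})
        # The query is a string literal, so it is known at compile time.
        return bc.sql("select A from T1 where C > 5")

    return f
```

Calling make_inline_query() returns f, which can then be called with a pandas DataFrame that has columns A and C.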
API Reference
- bodosql.BodoSQLContext(tables: Optional[Dict[str, Union[pandas.DataFrame, TablePath]]] = None, catalog: Optional[DatabaseCatalog] = None)
  Defines a BodoSQLContext with the given local tables and catalog.
  Arguments
  - tables: A dictionary that maps a name used in a SQL query to a DataFrame or TablePath object.
  - catalog: A DatabaseCatalog used to load tables from a remote database (e.g. Snowflake).
- bodosql.BodoSQLContext.sql(self, query: str, params_dict: Optional[Dict[str, Any]] = None, distributed: list|set|bool = set(), replicated: list|set|bool = set(), **jit_options)
  Executes a SQL query using the tables registered in this BodoSQLContext.
  Arguments
  - query: The SQL query to execute. This function generates code that is compiled, so the query argument must be a compile-time constant.
  - params_dict: A dictionary that maps names usable in SQL to Python variables. For more information, please refer to the BodoSQL named parameters section.
  - distributed, replicated, and other JIT options are passed to Bodo JIT. See the Bodo distributed flags documentation for more details.
  Example code:
    import numpy as np
    import pandas as pd
    import bodosql
    df = pd.DataFrame({"A": np.arange(10), "B": np.ones(10)})
    bc = bodosql.BodoSQLContext({"T1": df})
    out_df = bc.sql("select sum(B) from T1 group by A", distributed=["T1"])
  Returns
  A DataFrame that results from executing the query.
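As a sketch of how params_dict might be used, the helper below passes a Python value into a query. It assumes that named parameters are referenced in SQL with an @ prefix; check the BodoSQL named parameters section for the exact syntax. The helper name rows_above, the table T1, and the columns A and B are illustrative.

```python
def rows_above(bc, threshold):
    """Run a parameterized query on a BodoSQLContext-like object.

    `bc` only needs a .sql(query, params_dict) method, so this helper does
    not import BodoSQL itself.
    """
    # The query text stays constant; only the bound value changes per call.
    query = "select A from T1 where B > @threshold"  # '@' syntax is an assumption
    return bc.sql(query, {"threshold": threshold})
```

Keeping the value out of the query string keeps the query text fixed across calls, which fits the compile-time constant requirement on query.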
- bodosql.BodoSQLContext.add_or_replace_view(self, name: str, table: Union[pandas.DataFrame, TablePath])
  Creates a new BodoSQLContext from an existing BodoSQLContext by adding or replacing a table.
  Arguments
  - name: The name of the table to add. If the name already exists, references to that table are removed from the new context.
  - table: The table object to add. table must be a DataFrame or TablePath object.
  Returns
  A new BodoSQLContext that retains the tables and catalogs from the old BodoSQLContext and inserts the new table specified.
  Note
  This DOES NOT update the given context. Users should always use the BodoSQLContext object returned from the function call, e.g. bc = bc.add_or_replace_view("t1", table)
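To make the note concrete, here is a toy model of the same copy-on-update pattern in plain Python. ToyContext is a hypothetical stand-in, not BodoSQL's implementation; it only mirrors the behavior described above, where the call returns a new object and leaves the original unchanged.

```python
class ToyContext:
    """Hypothetical stand-in that mimics the copy-on-update behavior only."""

    def __init__(self, tables=None):
        # Copy the mapping so later changes to the caller's dict cannot leak in.
        self.tables = dict(tables or {})

    def add_or_replace_view(self, name, table):
        # Build a new mapping instead of mutating self.tables.
        updated = dict(self.tables)
        updated[name] = table
        return ToyContext(updated)


old_bc = ToyContext({"T1": "first table"})
new_bc = old_bc.add_or_replace_view("T2", "second table")

print(sorted(old_bc.tables))  # ['T1']
print(sorted(new_bc.tables))  # ['T1', 'T2']
```

Discarding the return value would leave only old_bc in hand, and old_bc still has no T2 table.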
- bodosql.BodoSQLContext.remove_view(self, name: str)
  Creates a new BodoSQLContext from an existing context by removing the table with the given name. If the name does not exist, a BodoError is thrown.
  Arguments
  - name: The name of the table to remove.
  Returns
  A new BodoSQLContext that retains the tables and catalogs from the old BodoSQLContext minus the table specified.
  Note
  This DOES NOT update the given context. Users should always use the BodoSQLContext object returned from the function call, e.g. bc = bc.remove_view("t1")
- bodosql.BodoSQLContext.add_or_replace_catalog(self, catalog: DatabaseCatalog)
  Creates a new BodoSQLContext from an existing context by replacing the BodoSQLContext object's DatabaseCatalog with a new catalog.
  Arguments
  - catalog: The catalog to insert.
  Returns
  A new BodoSQLContext that retains tables from the old BodoSQLContext but replaces the old catalog with the new catalog specified.
  Note
  This DOES NOT update the given context. Users should always use the BodoSQLContext object returned from the function call, e.g. bc = bc.add_or_replace_catalog(catalog)
- bodosql.BodoSQLContext.remove_catalog(self)
  Creates a new BodoSQLContext from an existing context by removing its DatabaseCatalog.
  Returns
  A new BodoSQLContext that retains tables from the old BodoSQLContext but removes the old catalog.
  Note
  This DOES NOT update the given context. Users should always use the BodoSQLContext object returned from the function call, e.g. bc = bc.remove_catalog()
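Because every update method returns a new context, a sequence of updates has to rebind the variable at each step. The helper below is one way to write that; refresh_context and its parameter names are illustrative, and bc is assumed to be a BodoSQLContext or any object that exposes the same methods.

```python
def refresh_context(bc, table_name, new_table, new_catalog=None):
    """Return an updated context, rebinding after every call.

    Only add_or_replace_view and add_or_replace_catalog are used, so any
    object with those two methods works here.
    """
    # Each method returns a new context, so the result must be reassigned.
    bc = bc.add_or_replace_view(table_name, new_table)
    if new_catalog is not None:
        bc = bc.add_or_replace_catalog(new_catalog)
    return bc
```

The caller should keep the returned object, for example bc = refresh_context(bc, "T1", df), since the context passed in is not modified.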