BodoSQLContext API¶
The BodoSQLContext
API is the primary interface for executing SQL queries. It performs two roles:
- Registering data and connection information to load tables of interest.
- Forwarding SQL queries to the BodoSQL engine for compilation and execution. This is done via the
bc.sql(query)
method, wherebc
is aBodoSQLContext
object.
A BodoSQLContext
can be defined in regular Python and passed as an argument to JIT functions or can be
defined directly inside JIT functions. We recommend defining and modifying a BodoSQLContext
in regular
Python whenever possible.
For example:
bc = bodosql.BodoSQLContext(
{
"T1": bodosql.TablePath("my_file_path.pq", "parquet"),
},
catalog=bodosql.SnowflakeCatalog(
username,
password,
account_name,
warehouse_name,
database name,
)
)
@bodo.jit
def f(bc):
return bc.sql("select t1.A, t2.B from t1, catalogSchema.t2 where t1.C > 5 and t1.D = catalogSchema.t2.D")
API Reference¶
-
bodosql.BodoSQLContext(tables: Optional[Dict[str, Union[pandas.DataFrame|TablePath]]] = None, catalog: Optional[DatabaseCatalog] = None)
Defines a
BodoSQLContext
with the given local tables and catalog.Arguments
-
tables
: A dictionary that maps a name used in a SQL query to aDataFrame
orTablePath
object. -
catalog
: ADatabaseCatalog
used to load tables from a remote database (e.g. Snowflake).
-
-
bodosql.BodoSQLContext.sql(self, query: str, params_dict: Optional[Dict[str, Any] = None, distributed: list|set|bool = set(), replicated: list|set|bool = set(), **jit_options)
Executes a SQL query using the tables registered in this
BodoSQLContext
.Arguments
query
: The SQL query to execute. This function generates code that is compiled so thequery
argument is required to be a compile time constant.
-
params_dict
: A dictionary that maps a SQL usable name to Python variables. For more information please refer to the BodoSQL named parameters section. -
distributed
,replicated
, and other JIT options are passed to Bodo JIT. See Bodo distributed flags documentation for more details. Example code:df = pd.DataFrame({"A": np.arange(10), "B": np.ones(10)}) bc = bodosql.BodoSQLContext({"T1": df}) out_df = bc.sql("select sum(B) from T1 group by A", distributed=["T1"])
Returns
A
DataFrame
that results from executing the query. -
bodosql.BodoSQLContext.add_or_replace_view(self, name: str, table: Union[pandas.DataFrame, TablePath])
Create a new
BodoSQLContext
from an existingBodoSQLContext
by adding or replacing a table.Arguments
-
name
: The name of the table to add. If the name already exists references to that table are removed from the new context. -
table
: The table object to add.table
must be aDataFrame
orTablePath
object.
Returns
A new
BodoSQLContext
that retains the tables and catalogs from the oldBodoSQLContext
and inserts the new table specified.Note
This DOES NOT update the given context. Users should always use the
BodoSQLContext
object returned from the function call. e.g.bc = bc.add_or_replace_view("t1", table)
-
-
bodosql.BodoSQLContext.remove_view(self, name: str)
Creates a new
BodoSQLContext
from an existing context by removing the table with the given name. If the name does not exist, aBodoError
is thrown.Arguments
name
: The name of the table to remove.
Returns
A new
BodoSQLContext
that retains the tables and catalogs from the oldBodoSQLContext
minus the table specified.Note
This DOES NOT update the given context. Users should always use the
BodoSQLContext
object returned from the function call. e.g.bc = bc.remove_view("t1")
-
bodosql.BodoSQLContext.add_or_replace_catalog(self, catalog: DatabaseCatalog)
Create a new
BodoSQLContext
from an existing context by replacing theBodoSQLContext
object'sDatabaseCatalog
with a new catalog.Arguments
catalog
: The catalog to insert.
Returns
A new
BodoSQLContext
that retains tables from the oldBodoSQLContext
but replaces the old catalog with the new catalog specified.Note
This DOES NOT update the given context. Users should always use the
BodoSQLContext
object returned from the function call. e.g.bc = bc.add_or_replace_catalog(catalog)
-
bodosql.BodoSQLContext.remove_catalog(self)
Create a new
BodoSQLContext
from an existing context by removing itsDatabaseCatalog
.Returns
A new
BodoSQLContext
that retains tables from the oldBodoSQLContext
but removes the old catalog.Note
This DOES NOT update the given context. Users should always use the
BodoSQLContext
object returned from the function call. e.g.bc = bc.remove_catalog()