Getting started with Tabular Database Catalogs
Tabular is a universal data platform built around Iceberg. Bodo supports Tabular Database Catalogs, which allow you to read and write tables using the REST Iceberg Catalog provided by Tabular.
In this guide, we will walk you through the steps to create and use Tabular Database Catalogs on the Bodo Platform.
Prerequisites
- A Bodo Platform account with an active subscription is required.
- A Tabular account is required.
- Service credentials for the Tabular account are required. See the Tabular documentation for more information.
Creating a Tabular Database Catalog
Once you have a Tabular account, log into the Bodo Platform and navigate to the Catalogs section in the sidebar. Click CREATE CATALOG and fill out the form with the required values.
A few things to note:
- The Catalog Type should be set to Tabular. This will automatically fetch the required fields for the catalog details.
- The Iceberg REST URL field of the catalog form should be filled with the Iceberg REST URI for your Tabular account. Typically this will be https://api.tabular.io/ws/. If you are using a test account on Tabular, the URI may be https://api.test.tabular.io/ws/.
- The Credential field should be filled with the service credential you created in the Tabular account, which has the format clientid:clientsecret.
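Before submitting the form, it can help to sanity-check the two values that are easiest to get wrong. The following is a minimal Python sketch and is not part of Bodo or Tabular: the helper names split_credential and looks_like_rest_uri are hypothetical, the credential and URI values are placeholders, and the checks only encode what is stated above (a clientid:clientsecret credential and an HTTPS URI).

```python
from urllib.parse import urlparse


def split_credential(credential: str) -> tuple[str, str]:
    """Split a 'clientid:clientsecret' string into its two parts."""
    # Split on the first colon only, so a secret containing ':' stays intact.
    client_id, sep, client_secret = credential.partition(":")
    if not sep or not client_id or not client_secret:
        raise ValueError("credential must have the form clientid:clientsecret")
    return client_id, client_secret


def looks_like_rest_uri(uri: str) -> bool:
    """Return True if the URI uses HTTPS and names a host."""
    parsed = urlparse(uri)
    return parsed.scheme == "https" and bool(parsed.netloc)


# Placeholder values for illustration only; substitute your own.
client_id, client_secret = split_credential("my-client-id:my-client-secret")
uri_ok = looks_like_rest_uri("https://api.tabular.io/ws/")
```

A check like this only catches formatting mistakes; whether the credential is actually valid is determined by Tabular when the catalog is used.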
Upon submitting the form, you will see that your Catalog has been created and is now available to use.
Using Interactive Notebooks with Tabular Catalogs
First, create a cluster on the Bodo Platform to run the interactive notebook and attach the notebook to the cluster.
For this example, we created a Tabular catalog with the following details:
- Catalog Name: test-tabular
- Iceberg REST URL: https://api.test.tabular.io/ws/
- Warehouse: sandbox
Next, set the cell type to SQL and select the Tabular catalog from the catalog selector dropdown. The SQL query you run will be executed against the tables in the Tabular catalog you selected.
We will run a simple query to read the table nyc_taxi_yellow in the examples database in our Tabular account.
Note that the database and table names must be wrapped in double quotes, because Bodo SQL currently treats identifiers as case-sensitive only when they are quoted. See this section to learn more.
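To make the quoting concrete, here is a minimal Python sketch that assembles a query of this kind as a string. It is an illustration only: quote_ident and build_select are hypothetical helpers, the LIMIT of 10 is an arbitrary choice, and doubling embedded double quotes follows the standard SQL convention, which is assumed rather than confirmed for Bodo SQL.

```python
def quote_ident(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_select(database: str, table: str, limit: int = 10) -> str:
    """Build a SELECT that reads a few rows from a quoted database.table."""
    return (
        f"SELECT * FROM {quote_ident(database)}.{quote_ident(table)} "
        f"LIMIT {int(limit)}"
    )


query = build_select("examples", "nyc_taxi_yellow")
print(query)  # SELECT * FROM "examples"."nyc_taxi_yellow" LIMIT 10
```

The printed statement is the text you would type into the SQL cell; the helper exists only to show where the double quotes go.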
After running the query, the results are stored in a distributed dataframe named LAST_SQL_OUTPUT.
You can now use the LAST_SQL_OUTPUT dataframe in your code to perform further operations.
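As a rough sketch of such a follow-up step, the Python below computes an average per group with ordinary pandas operations. The column names passenger_count and trip_distance are hypothetical and may not match the actual table schema, the function name is made up for this example, and a small stand-in dataframe replaces LAST_SQL_OUTPUT so the code runs outside a notebook.

```python
import pandas as pd


def mean_distance_by_passengers(df: pd.DataFrame) -> pd.Series:
    """Average trip distance for each passenger count."""
    return df.groupby("passenger_count")["trip_distance"].mean()


# Stand-in for LAST_SQL_OUTPUT so the sketch runs outside a notebook.
# The column names are hypothetical; check the actual table schema.
stand_in = pd.DataFrame(
    {"passenger_count": [1, 1, 2], "trip_distance": [1.0, 3.0, 5.0]}
)
summary = mean_distance_by_passengers(stand_in)
```

In a notebook cell attached to the cluster, you would pass LAST_SQL_OUTPUT in place of stand_in, assuming the columns you group and aggregate on exist in the query result.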
See Also