Supported DataFrame Data Types¶
BodoSQL uses Pandas DataFrames to represent SQL tables in memory and converts SQL types to corresponding Python types which are used by Bodo. Below is a table mapping SQL types used in BodoSQL to their respective Python types and Bodo data types.
SQL Type(s) | Equivalent Python Type | Bodo Data Type |
---|---|---|
TINYINT | np.int8 | bodo.int8 |
SMALLINT | np.int16 | bodo.int16 |
INT | np.int32 | bodo.int32 |
BIGINT | np.int64 | bodo.int64 |
FLOAT | np.float32 | bodo.float32 |
DECIMAL, DOUBLE | np.float64 | bodo.float64 |
VARCHAR, CHAR | str | bodo.string_type |
TIMESTAMP, DATE | np.datetime64[ns] | bodo.datetime64ns |
INTERVAL(day-time) | np.timedelta64[ns] | bodo.timedelta64ns |
BOOLEAN | np.bool_ | bodo.bool_ |
BodoSQL can also process DataFrames that contain Categorical or Date columns. However, Bodo will convert these columns to one of the supported types, which incurs a performance cost. We recommend restricting your DataFrames to the directly supported types when possible.
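As a minimal sketch of the mapping above, the following builds a pandas DataFrame whose columns already use the directly supported NumPy types. The column names are hypothetical, and the snippet uses only pandas and NumPy; it does not call BodoSQL itself.

```python
import numpy as np
import pandas as pd

# Hypothetical column names; each comment names the SQL type from the table.
df = pd.DataFrame(
    {
        "tiny": np.array([1, 2, 3], dtype=np.int8),              # TINYINT
        "small": np.array([1, 2, 3], dtype=np.int16),            # SMALLINT
        "regular": np.array([1, 2, 3], dtype=np.int32),          # INT
        "big": np.array([1, 2, 3], dtype=np.int64),              # BIGINT
        "single": np.array([1.5, 2.5, 3.5], dtype=np.float32),   # FLOAT
        "double": np.array([1.5, 2.5, 3.5], dtype=np.float64),   # DECIMAL, DOUBLE
        "name": ["a", "b", "c"],                                 # VARCHAR, CHAR
        "ts": np.array(
            ["2021-01-01", "2021-01-02", "2021-01-03"], dtype="datetime64[ns]"
        ),                                                       # TIMESTAMP, DATE
        "delta": np.array([1, 2, 3], dtype="timedelta64[ns]"),   # INTERVAL(day-time)
        "flag": np.array([True, False, True], dtype=np.bool_),   # BOOLEAN
    }
)

# Inspect the resulting column types before registering the DataFrame anywhere.
print(df.dtypes)
```

Checking `df.dtypes` up front is a cheap way to spot Categorical or Date columns that would otherwise be converted.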
Nullable and Unsigned Types¶
Although SQL does not explicitly support unsigned types, BodoSQL by default maintains the exact types of the DataFrames registered in a BodoSQLContext, including unsigned and non-nullable type behavior. If an operation can create null values or requires casting data, BodoSQL converts the input of that operation to a nullable, signed version of the type.
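To illustrate why a nullable type is needed once nulls can appear, here is a small sketch in plain pandas. It is not BodoSQL's internal conversion: the variable names are hypothetical, and the choice of `Int64` as the nullable signed type is an assumption made for the example.

```python
import numpy as np
import pandas as pd

# An unsigned, non-nullable column as it might exist in a registered DataFrame.
counts = pd.Series(np.array([1, 2, 250], dtype=np.uint8))

# Introducing a missing row into the NumPy-backed column makes pandas
# fall back to float64, because uint8 cannot represent a missing value.
with_gap_float = counts.reindex(range(4))

# Casting to a nullable, signed integer type first (assumed here: Int64)
# keeps integer semantics and stores the missing row as a null.
with_gap_nullable = counts.astype("Int64").reindex(range(4))

print(with_gap_float.dtype)     # float64
print(with_gap_nullable.dtype)  # Int64
```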
Supported Literals¶
BodoSQL supports the following literal types:
array_literal
boolean_literal
datetime_literal
float_literal
integer_literal
interval_literal
object_literal
string_literal
Array Literal¶
Syntax:
where <[> and <]> indicate literal [ and ] characters, and expr is any expression.
Array literals are lists of comma-separated expressions wrapped in square brackets.
Note that BodoSQL currently only supports homogeneous lists, and all exprs must coerce to a single type.
Boolean Literal¶
Syntax:
Boolean literals are case-insensitive.
Datetime Literal¶
Syntax:
Float Literal¶
Syntax:
where digit is any numeral from 0 to 9.
Integer Literal¶
Syntax:
where digit is any numeral from 0 to 9.
Interval Literal¶
Syntax:
Where integer_literal is a valid integer literal and interval type is one of:
In addition, BodoSQL has limited support for YEAR[S] and MONTH[S].
These literals cannot be stored in columns and are currently only supported for operations involving addition and subtraction.
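Because the type table maps day-time intervals to np.timedelta64[ns], adding or subtracting an interval can be pictured as NumPy datetime arithmetic. The sketch below uses NumPy only, with hypothetical variable names; it illustrates the arithmetic and is not BodoSQL's implementation.

```python
import numpy as np

# A timestamp value at nanosecond resolution (the TIMESTAMP mapping).
ts = np.datetime64("2021-01-01T12:00:00", "ns")

# A one-day interval, converted to nanosecond resolution
# (the INTERVAL(day-time) mapping).
one_day = np.timedelta64(1, "D").astype("timedelta64[ns]")

# Addition and subtraction are the two operations shown here.
later = ts + one_day
earlier = ts - one_day

print(later)
print(earlier)
```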
Object Literal¶
Syntax:
Where each ki is a unique string literal, and each vi is an expression.
Object literals obey the same semantics as the function OBJECT_CONSTRUCT, so any pair where the key or value is null is omitted. For now, BodoSQL only supports object literals in which all values have the same type.
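The semantics described above can be modeled in a few lines of plain Python. This is only a sketch: `object_construct` is a hypothetical helper written for illustration, not a BodoSQL function, and applying the same-type check after null pairs are dropped is a simplifying assumption.

```python
def object_construct(*pairs):
    """Model the described semantics on (key, value) pairs."""
    result = {}
    for key, value in pairs:
        if key is None or value is None:
            continue  # a null key or a null value: the pair is omitted
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")  # keys must be unique
        result[key] = value

    # The text states that all values must currently share one type.
    value_types = {type(v) for v in result.values()}
    if len(value_types) > 1:
        raise TypeError("all values must have the same type")
    return result


print(object_construct(("a", 1), ("b", None), ("c", 3)))  # {'a': 1, 'c': 3}
```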
String Literal¶
Syntax:
Where char is a character literal in a Python string.