SQLStreamBuilder SQL Syntax Reference
SQLStreamBuilder utilizes Apache Calcite compatible streaming SQL. Beyond that there are metadata commands, and some built-ins.
Some supported operations are:
FOO || 'baz'
- Tumble, Hopping window expressions
- Filter after aggregation
- Logical operators
- Arithmetic operators and functions
- Character string operators and functions
- Binary string operators and functions
- Date/Time functions
- Type conversion
- Aggregate functions
- Grouped window functions
- Grouped Auxiliary functions
and a list of data types:
show tables- list virtual tables.
desc <vtable>- describe the specified virtual table, showing columns and types.
show jobs- list current running SQL jobs.
show history- show SQL query history (only successfully parsed/executed).
help- show help.
Joins are considered "hyperjoins" because SQLStreamBuilder has the ability to join multiple virtual tables in a single query, and because a virtual table is created from a data provider, these joins can span multiple clusters/connect strings, but also multiple types of sources (join Kafka and a database for instance). So something like this is possible:
SELECT us_west.user_score+ap_south.user_score FROM kafka_in_zone_us_west us_west FULL OUTER JOIN kafka_in_zone_ap_south ap_south ON us_west.user_id = ap_south.user_id;
Time attributes define how streams are processed in time. There are two high-level options for providing time attributes to your SQLStreamBuilder queries.
Source-provided timestamps are injected directly into the data stream by the source connector. For example, when using a Kafka data source, the timestamp extracted from the Kafka message header will be embedded in the data stream by default, and exposed via a magic "eventTimestamp" column. For example, this query will use the built-in "eventTimestamp" column to window the query based on the timestamp recorded in the source Kafka topic.
SELECT flight_number, flight_origin, flight_destination, TUMBLE_END("eventTimestamp", INTERVAL '5' MINUTE) AS window_end_timestamp FROM airplane_flights GROUP BY TUMBLE("eventTimestamp", INTERVAL '5' MINUTE)
This query will use the source-provided timestamp from two virtual tables to perform a streaming join on multiple Kafka topics:
SELECT a.web_order_id, a.product_name, a.order_date b.next_shipment_time FROM online_orders a, shipment_events b WHERE a.shipping_type = 'Priority' AND a.eventTimestamp BETWEEN b.eventTimestamp - INTERVAL '1' HOUR AND b.eventTimestamp
The user can also specify timestamps contained in the data stream itself.
- The timestamp should be contained in a column of type "long".
- The timestamp should be in epoch format (eg. milliseconds since Jan 1, 1970).
For example, if your schema includes a field called "timestamp_ms", it's possible to construct a query such as:
SELECT * FROM airplane_flights a WHERE flight_number IS NOT NULL GROUP BY HOP(a."timestamp_ms", INTERVAL '15' SECOND, INTERVAL '5' SECOND), flight_number
Note that if your timestamp column is called "timestamp", it's required that you enclose the column name in quotation marks (because it is a reserved word - otherwise they are optional).
If an invalid timestamp is encountered in the stream (eg. NaN), then that message's timestamp defaults to 0, causing the message to be excluded from the current window.
If your data does not include a timestamp in a suitable format, it's possible to compute a new timestamp column from another existing column (using input transformations). See the documentation for Input Transformations for more information.
Built In Functions
There are some built in convenience functions:
-- convert EPOCH time to timestamp select EPOCH_TO_TIMESTAMP(1593718981) from ev_sample_fraud; -- convert EPOCH milliseconds to timestamp select EPOCHMILLIS_TO_TIMESTAMP(1593718838150) from ev_sample_fraud;
Escaping and quoting
Typical escaping and quoting is supported. For example:
- Nested columns:
SELECT foo.`bar` FROM table; -- must quote nested column
SELECT "some string literal" FROM mytable; -- a literal
More examples on SQL identifiers and quoting.