Virtual Tables Guide

Virtual Tables are the interface for inputs (sources) and outputs (sinks) in SQLStreamBuilder. Sources and Sinks are configured separately, and may require a data source to be configured before they become selectable. Data is read continuously from the source by the SQLStreamBuilder SQL job, and the results of the query are written continuously to the sink.

A sink is optional; when composing a SQL statement or configuring a Materialized View, it may be preferable to output data to the screen rather than send it to a sink directly. A Virtual Table Sink can be backed by the same topic as a Virtual Table Source, so chains of SQLStreamBuilder jobs can be built to process data in stages.
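
For example, one job can write its results to a Kafka topic while a second job reads them back. A minimal sketch of such a chain (the planes_filtered table and its columns are hypothetical):

-- Job 1: reads from the airplanes source; its Virtual Table Sink is
-- backed by the Kafka topic behind the planes_filtered virtual table
SELECT icao, lat, lon FROM airplanes WHERE icao <> 0;

-- Job 2: a Virtual Table Source backed by that same topic
-- continues processing the output of Job 1
SELECT icao FROM planes_filtered WHERE lat > 45;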

Virtual Table Source

A Virtual Table Source is a logical definition of the data source, including the location and connection parameters, a schema, an optional Input Transform, and any context-specific configuration parameters required.

The Virtual Table Source is specified in the SQL being run as the FROM clause. If you are performing a join, then each table specified in the SQL statement must reference a Virtual Table Source (the same source may appear multiple times, as in a self-join).

For example:

SELECT
lat,
lon
FROM
airplanes -- the name of the virtual table source
WHERE
icao <> 0;
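
Because every table in a join must reference a Virtual Table Source, a self-join simply references the same source twice. A sketch, assuming the airplanes source also has an altitude column:

SELECT
a.icao,
b.icao
FROM
airplanes a -- the same virtual table source, aliased twice
JOIN
airplanes b
ON
a.altitude = b.altitude
WHERE
a.icao <> b.icao;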

Creating a virtual table source

  • Select SQLStreamBuilder from the left-hand menu.
  • Select the Virtual Tables tab.
  • Select the Virtual Table Source sub-tab.
  • Select Add Source->Apache Kafka.
  • Name the Virtual Table. This is the name used in the FROM clause.
  • Select the Kafka Cluster, Topic Name, and Data Format.
  • If you have a pre-built schema, or want to change the default schema, edit it by hand, OR select Detect Schema at the bottom left to read a sample of the messages and infer the schema automatically. If there are no messages in the topic, no schema can be inferred.
  • Configure an Input Transform, adding the code using the Transformations tab (OPTIONAL).
  • Configure any Kafka properties required using the Properties tab (OPTIONAL).
  • Select Save Changes. The Virtual Table Source is now ready for queries to be run. You can check its configuration by issuing a DESC <tablename> in the Compose tab and selecting Execute, as shown below.
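
For example, for the source created above:

DESC airplanes; -- returns the schema of the airplanes virtual table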

Virtual Table Sink

Virtual Table Sinks are optionally specified when running SQL. A sink can be one of several data source types, and the data source must be configured before a Virtual Table Sink can be created on it. The results from the SQLStreamBuilder SQL query are sent to the Virtual Table Sink continuously.

There are a number of valid types:

  • Kafka
  • AWS S3
  • Elasticsearch
  • Google Cloud Storage (GCS)
  • Webhook

Kafka Sink

To create a Kafka sink:

  • Select SQLStreamBuilder from the left-hand menu.
  • Select the Virtual Tables tab.
  • Select the Virtual Table Sink sub-tab.
  • Select Add Sink->Apache Kafka.
  • Name the Virtual Table. This is the name used in the Virtual Table Sink drop-down on the Compose tab when running SQL.
  • Select Kafka Cluster and Topic Name.

Note: all output is JSON-encoded data in the format {"<column>":<value>}.
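
For example, a minimal sketch (the sink topic and values are hypothetical):

-- a query whose results go to a Kafka Virtual Table Sink
SELECT lat, lon FROM airplanes;
-- each result row is written to the sink topic as JSON, for example:
-- {"lat":37.66,"lon":-122.41}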

A Materialized View is not a Virtual Table Sink; it is configured using the Materialized View tab.

Webhook Sink

The webhook sink is useful for sending data to HTTP endpoints via POST or PUT. It can be configured to perform an HTTP action for every message (the default), or you can supply code that controls how often it fires (for instance, once every N messages; an example appears below).

When developing webhook sinks, it is highly useful to use hookbin to develop and QA your webhook before pointing it at your true destination.

To create a webhook sink:

  • Select SQLStreamBuilder from the left-hand menu.
  • Select the Virtual Tables tab.
  • Select the Virtual Table Sink sub-tab.
  • Select Add Sink->Webhook.
  • Name the webhook.
  • Enter an HTTP EndPoint. It must start with http:// or https://. For initial testing, paste a hookbin endpoint into the text field and inspect the output on the hookbin site; once you are satisfied with the output, point the sink at your final endpoint.
  • Give the webhook a Description.
  • Select POST or PUT in the HTTP Method select box.
  • Choose to Disable SSL Validation if needed; hookbin works best with SSL validation disabled.
  • Choose Use Request Template. If set to yes, the template defined in the Request Template tab is used for output. This is useful if the service you are posting to requires a particular data output format.
  • In the Code editor, you can specify a code block that controls when the webhook fires. For a webhook that is called for every message, the following code would be used:
// Boolean function that takes the entire row from the query as a JSON object

function onCondition(rowAsJson){
   return true;   // return false here for a no-op, or plug in custom logic
}
onCondition($p0)  // $p0 is the current row from the query

NOTE: rowAsJson is the result of the Continuous SQL query being run, in {"name":"value"} format.
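
To fire the webhook less often than once per message, the condition can keep a counter. A minimal sketch, assuming the script's top-level state persists between invocations (verify this for your deployment); the interval of 100 is arbitrary:

if (typeof counter === 'undefined') {
   counter = 0;   // initialize once, if top-level state is preserved
}
function onCondition(rowAsJson){
   counter = counter + 1;
   return counter % 100 === 0;   // fire the webhook once every 100 messages
}
onCondition($p0)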

  • Optionally add headers via the HTTP Headers tab. Headers are name:value pairs; for instance, Content-Type:application/json.
  • If Use Request Template is set to yes, then a request template must be specified. It must be valid JSON and use "${columnname}" placeholders to represent fields. For example, a template for use with PagerDuty could look like this:
{
   "incident":{
      "type":"incident",
      "title":"${icao} is too high!",
      "body":{
         "type":"incident_body",
         "details":"Airplane with id ${icao} has reached an altitude of ${altitude} meters."
      }
   }
}
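
When the webhook fires, each "${columnname}" placeholder is replaced with the value from the current row; for a hypothetical row where icao is ABC123 and altitude is 12000, the title would render as "ABC123 is too high!".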
  • Select Save Changes.
  • Test and verify the webhook is working by selecting it as a Virtual Table Sink when running your query.