CREATE EXTERNAL STREAM
External stream for Kafka is official supported. The external stream for local log files is at technical preview. In Timeplus Enterprise, it also supports another type of External Stream to read/write data for a remote Timeplus Enterprise.
Kafka External Stream
CREATE EXTERNAL STREAM [IF NOT EXISTS] stream_name (<col_name1> <col_type>)
SETTINGS type='kafka',
brokers='ip:9092',
topic='..',
security_protocol='..',
username='..',
password='..',
sasl_mechanism='..',
data_format='..',
kafka_schema_registry_url='..',
kafka_schema_registry_credentials='..',
ssl_ca_cert_file='..',
ss_ca_pem='..',
skip_ssl_cert_check=..
The supported values for security_protocol
are:
- PLAINTEXT: when this option is omitted, this is the default value.
- SASL_SSL: when this value is set, username and password should be specified.
- If you need to specify own SSL certification file, add another setting
ssl_ca_cert_file='/ssl/ca.pem'
New in Proton 1.5.5, you can also put the full content of the pem file as a string in thessl_ca_pem
setting if you don't want to, or cannot use a file path, such as on Timeplus Cloud or in Docker/Kubernetes environments. - Skipping the SSL certification verification can be done via
SETTINGS skip_ssl_cert_check=true
.
- If you need to specify own SSL certification file, add another setting
The supported values for sasl_mechanism
are:
- PLAIN: when you set security_protocol to SASL_SSL, this is the default value for sasl_mechanism.
- SCRAM-SHA-256
- SCRAM-SHA-512
The supported values for data_format
are:
- JSONEachRow: each Kafka message can be a single JSON document, or each row is a JSON document. Learn More.
- CSV: less commonly used. Learn More.
- ProtobufSingle: for single Protobuf message per Kafka message
- Protobuf: there could be multiple Protobuf messages in a single Kafka message.
- Avro: added in Proton 1.5.2
- RawBLOB: the default value. Read/write Kafka message as plain text.
For examples to connect to various Kafka API compatitable message platforms, please check this doc.
Timeplus External Stream
CREATE EXTERNAL STREAM [IF NOT EXISTS] stream_name (<col_name1> <col_type>)
SETTINGS
type = 'timeplus',
hosts = '<ip_or_host_of_timeplusd>',
db = 'default',
user = '<user>',
password = '<password>',
secure = <bool>,
stream = '<stream_name>'
Settings:
- hosts: the IP or host for the remote timeplusd. When you set a set of hosts with ports, e.g. 'host1:port1,host2:port2', this will treat each host as a shard.
hosts
is required and there is no default value. - db: the database name in the remote Timeplusd. The default value is 'default'.
- user: the user name for the remote Timeplusd. The default value is 'default'.
- password: the password for the remote Timeplusd. The default value is an empty string.
- secure: a bool for whether to use secure connection to the remote Timeplusd. The default value is false. Use port 9440 when
secure
is set to true, otherwise use port 8463. - stream: the stream name in the remote Timeplusd. It's required and there is no default value.
Examples
Migrate data from Timeplus Proton to Timeplus Enterprise
If you have deployed Timeplus Proton and want to load those data to a Timeplus Enterprise deployment, you cannot upgrade in place. You can use the Timeplus External Stream to migrate data.
The Timeplus Proton need to be 1.5.15 or above.
For example, there is a stream streamA
in Timeplus Proton, running on host1.
In your Timeplus Enterprise, you can create the stream with the same name and same schema. Then use a materialized view to load all data from Timeplus Proton to Timeplus Enterprise.
CREATE STREAM streamA(..);
CREATE EXTERNAL STREAM streama_proton
SETTINGS type='timeplus',hosts='host1',stream='streamA';
CREATE MATERIALIZED VIEW proton_to_tp_enterprise INTO streamA
AS SELECT * FROM streama_proton WHERE _tp_time>earliest_ts();
When all data in Proton has been imported into Timeplus Enterprise, you can drop the materialized view.
Upload data from edge server to the cloud
If you deploy Timeplus Proton or Timeplus Enterprise at edge servers, it can collect and process live data with high performance and low footprint. The important data can be uploaded to the other Timeplus Enterprise in the cloud when the internet is available.
For example, on the edge server, you collect the real-time web access log and only want to upload error logs to the server.
CREATE EXTERNAL STREAM stream_in_cloud
SETTINGS type='timeplus',hosts='cloud1',stream='..';
CREATE MATERIALIZED VIEW edge_to_cloud INTO stream_in_cloud
AS SELECT * FROM local_stream WHERE http_code>=400;
When the network is not available, you can pause the materialized view by:
SYSTEM PAUSE MATERIALIZED VIEW edge_to_cloud;
When the network is restored, you can resume it:
SYSTEM UNPAUSE MATERIALIZED VIEW edge_to_cloud;
Limitations
This is a relatively new feature. There are some known limitations which we plan to improve later on.
- table function is not supported. In order to query all or part of the historical data, you can start a streaming query with
WHERE _tp_time>earliest_ts()
orWHERE _tp_time>now()-2d
. - window functions like tumble/hop are not working yet.
- can't read virtual columns on remote streams.