Direct materialization from Estuary to ClickHouse
Estuary provides a direct materialization connector with ClickHouse that uses ClickHouse's native protocol and native format.
This allows Estuary to:
- Materialize data to both self-hosted and ClickHouse Cloud instances
- Automatically handle tasks like table creation and schema evolution
- Support soft or hard deletes
- Use
ReplacingMergeTreefor standard merge updates orMergeTreefor delta updates - Provide exactly-once delivery
See Estuary's Kafka ClickPipe integration for a ClickPipes workflow.
Setup guide
Prerequisites
You will need:
- An Estuary account
- One or more captures in Estuary that pull data from your desired sources
- A ClickHouse instance, self-hosted or Cloud account
- A ClickHouse database user with credentials
Configure ClickHouse for integration
To set up Estuary's ClickHouse connector, you will need to gather some information from your ClickHouse instance and configure user permissions.
-
Copy your database's host endpoint.
For the port, use 9440 if TLS is enabled or 9000 if TLS is disabled.
Together, the host and port will form the address you need to provide to Estuary.
-
Grant permissions to the database user that Estuary will access.
To automatically create and manage tables for you, Estuary will need
CREATE,SELECT,INSERT, etc permissions on your target database as well as permissions for metadata discovery and partition management.You can grant all required permissions by running these SQL commands, replacing
<database>and<user>with your own information: -
Optionally restrict user system access to only the target database.
You can do so with row-level policies. For example:
You can then move to Estuary to finish setup.
Create an Estuary materialization
-
In Estuary's dashboard, go to the Destinations page.
-
Click + New Materialization.
-
Select the ClickHouse connector.
-
Fill out the Materialization Details section.
- Provide a unique name for your materialization
- Choose a data plane (cloud provider and region)
-
Fill out Endpoint Config details so Estuary can connect to your ClickHouse instance.
- Address: the host and port of your instance
- Database: target database name
- Authentication: username and password for the database user
You can also configure optional settings, such as whether to use hard deletes and the SSL mode to use.
Configure source collections
Choose which sources you'd like to materialize into ClickHouse in the Source Collections section.
-
Link an existing capture or add individual data collections to materialize to ClickHouse.
-
Select a data collection from the list to configure further if necessary. Customization options include:
- Choose a different table name for the collection
- Select merge behavior for the collection (whether to use delta updates mode)
- Customize field selection behavior to control which fields are materialized
-
Once you're happy with how data will be materialized to ClickHouse, click Next and Save and Publish.
Estuary will start backfilling data from the selected collections to ClickHouse and then stream updates as they occur.
Additional resources
For more on setting up a ClickHouse connector with Estuary, see Estuary's documentation:
- Reference Estuary's ClickHouse materialization docs.
- Besides the UI-based workflow provided in these instructions, you can also manage pipeline setup with Estuary via CLI. See Estuary's guides on
flowctlfor more on working with Estuary programmatically.