Building a data warehouse from scratch is a challenging task. You have to build a solid ETL pipeline, provide highly available databases, and expose API endpoints with batch processing and rate limiting.
Data warehouses are like dams: several applications might depend on them, and any misfortune can cause disaster. I am not going to go any deeper into what data warehouses are or how to build one. In this article, I am going to explain how you can scale your ClickHouse StatefulSets, the heart of the warehouse, on your Kubernetes cluster with sharding and replication.
What do Replication and Sharding Mean?
Replication duplicates data across multiple servers to ensure high availability and fault tolerance. Sharding, on the other hand, distributes a large dataset across several servers so you can manage large volumes of data and handle high-throughput operations.
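To make that concrete, here is the layout we build below: 2 shards with 2 replicas each, four nodes in total. The data is split into two halves, and each half is stored twice:

shard 1: replica 1 + replica 2   -> first half of the data, stored on two nodes
shard 2: replica 1 + replica 2   -> second half of the data, stored on two nodes

Losing any single node never loses data, and queries can fan out across both shards in parallel.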
That sounds awesome, and it is exactly what we need for our data warehouse. Building a database engine that provides sharding and replication is not our concern for now. Most well-known database engines already support both; Postgres, for instance, can be scaled with a Patroni cluster, and their official docs cover the details. In our case, we are going to follow ClickHouse's official documentation for scaling. However, that documentation targets a Dockerized setup, so in this article we adapt the same steps to a Kubernetes cluster.
All set. Let's start.
You can refer to the YAML files in this repository: ahmethedev/ch-2R2S-cluster for a ready-to-use ClickHouse cluster configuration with 2 shards and 2 replicas.
1. Set Up ZooKeeper
ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. All of these kinds of services are used in some form or another by distributed applications.
ZooKeeper is a crucial service for our case. We are planning a 4-pod design for scaling ClickHouse: 2 shards, each with 2 replicas. Replication and sharding demand careful coordination, and ClickHouse uses ZooKeeper to store metadata about the cluster, such as which node holds which table and which data parts have already been replicated and which have not yet. ZooKeeper also plays a critical role in distributed DDL statements such as CREATE or ALTER: if you run an ON CLUSTER DDL statement on one node, say clickhouse-cluster-0, the DDL task queue in ZooKeeper ensures the same statement is executed on the remaining nodes. It also handles leader election among the replicas. As you can see, we use ZooKeeper for all of the distributed system requirements.
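To illustrate the DDL queue, a statement like the one below, run on any single node, is recorded in ZooKeeper and replayed on every other node of the cluster. The cluster name cluster_2S_2R is only a placeholder here; use whatever name your remote_servers configuration defines:

-- Run on any node, e.g. clickhouse-cluster-0; ZooKeeper's DDL queue
-- propagates the statement to the remaining nodes.
CREATE DATABASE IF NOT EXISTS warehouse ON CLUSTER cluster_2S_2R;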
How to Set Up ZooKeeper on Your Kubernetes Cluster?
Create a namespace for ZooKeeper:
kubectl create namespace zookeeper
Apply all ZooKeeper configuration files:
kubectl apply -f zookeeper/
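If you are writing these manifests yourself rather than using the repository above, the heart of the ZooKeeper ConfigMap is a zoo.cfg that lists all three ensemble members through the headless service. A minimal sketch, assuming the headless service is named zookeeper-headless in the zookeeper namespace:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/data
clientPort=2181
# one entry per ensemble member, addressed via the headless service
server.1=zookeeper-0.zookeeper-headless.zookeeper.svc.cluster.local:2888:3888
server.2=zookeeper-1.zookeeper-headless.zookeeper.svc.cluster.local:2888:3888
server.3=zookeeper-2.zookeeper-headless.zookeeper.svc.cluster.local:2888:3888
# allow the four-letter-word commands used below (ruok, stat)
4lw.commands.whitelist=ruok,stat

Each pod also needs a myid file in dataDir whose number matches its server.N entry; in a StatefulSet this is typically derived from the pod ordinal at startup.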
Wait until the first pod is ready:
kubectl get pods -n zookeeper -w
Test the first pod:
kubectl exec -n zookeeper zookeeper-0 -- bash -c "echo ruok | nc localhost 2181" # Expected output: imok
Scale to multiple pods:
# Add the second pod
kubectl scale statefulset zookeeper -n zookeeper --replicas=2

# Add the third after the second pod is ready
kubectl scale statefulset zookeeper -n zookeeper --replicas=3
Verify cluster status:
kubectl exec -n zookeeper zookeeper-0 -- bash -c "echo stat | nc localhost 2181"

# Expected "Mode" lines across the ensemble:
# zookeeper-0: Mode: follower (or leader)
# zookeeper-1: Mode: leader (or follower)
# zookeeper-2: Mode: follower
We are all set for ZooKeeper.
2. Set Up ClickHouse with Sharding and Replication
As I mentioned earlier, we are not going to write a database engine from scratch for scaling. ClickHouse already has this capability; we just need to configure it.
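The essential pieces of the ClickHouse ConfigMap are the remote_servers definition (which lays out the 2 shards x 2 replicas), the zookeeper section, and the per-pod macros. A minimal sketch, assuming a headless service named clickhouse-headless and a cluster named cluster_2S_2R (both placeholders; match them to your own manifests):

<clickhouse>
  <!-- 2 shards, each with 2 replicas -->
  <remote_servers>
    <cluster_2S_2R>
      <shard>
        <internal_replication>true</internal_replication>
        <replica><host>clickhouse-cluster-0.clickhouse-headless</host><port>9000</port></replica>
        <replica><host>clickhouse-cluster-1.clickhouse-headless</host><port>9000</port></replica>
      </shard>
      <shard>
        <internal_replication>true</internal_replication>
        <replica><host>clickhouse-cluster-2.clickhouse-headless</host><port>9000</port></replica>
        <replica><host>clickhouse-cluster-3.clickhouse-headless</host><port>9000</port></replica>
      </shard>
    </cluster_2S_2R>
  </remote_servers>

  <!-- where ClickHouse finds the ZooKeeper ensemble -->
  <zookeeper>
    <node><host>zookeeper-0.zookeeper-headless.zookeeper.svc.cluster.local</host><port>2181</port></node>
    <node><host>zookeeper-1.zookeeper-headless.zookeeper.svc.cluster.local</host><port>2181</port></node>
    <node><host>zookeeper-2.zookeeper-headless.zookeeper.svc.cluster.local</host><port>2181</port></node>
  </zookeeper>

  <!-- macros differ per pod; this is what clickhouse-cluster-0 might use -->
  <macros>
    <shard>01</shard>
    <replica>clickhouse-cluster-0</replica>
  </macros>
</clickhouse>

Because the shard and replica macros differ per pod, they are usually generated from the pod ordinal at startup (for example in an init container or entrypoint script) rather than hard-coded once for all pods.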
Apply all ClickHouse configuration files:
kubectl apply -f clickhouse/
Wait for the first pod to be ready, then scale to 4 pods:
# Scale gradually to ensure proper initialization
kubectl scale statefulset clickhouse-cluster --replicas=2
kubectl wait --for=condition=ready pod clickhouse-cluster-1 --timeout=300s

kubectl scale statefulset clickhouse-cluster --replicas=3
kubectl wait --for=condition=ready pod clickhouse-cluster-2 --timeout=300s

kubectl scale statefulset clickhouse-cluster --replicas=4
kubectl wait --for=condition=ready pod clickhouse-cluster-3 --timeout=300s
All four pods need to be in the Running state. If you get errors, check your ConfigMap or StatefulSet manifests.
We are done with the ClickHouse setup. Now we should have a ClickHouse cluster with 2 replicas and 2 shards.
How to Verify the Cluster?
Connect to the ClickHouse pod:
kubectl exec -it clickhouse-cluster-0 -- clickhouse-client
Check the cluster topology:
SELECT cluster, shard_num, replica_num, host_name, port FROM system.clusters;
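With 2 shards and 2 replicas you should get four rows back, roughly like the following (the cluster name and host names below are placeholders; they will match whatever your ConfigMap defines):

cluster_2S_2R  1  1  clickhouse-cluster-0.clickhouse-headless  9000
cluster_2S_2R  1  2  clickhouse-cluster-1.clickhouse-headless  9000
cluster_2S_2R  2  1  clickhouse-cluster-2.clickhouse-headless  9000
cluster_2S_2R  2  2  clickhouse-cluster-3.clickhouse-headless  9000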
Check the ZooKeeper connection:
SELECT * FROM system.zookeeper WHERE path IN ('/', '/clickhouse');
Test distributed table creation:
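A minimal sketch, assuming the placeholder cluster name cluster_2S_2R and the standard {shard}/{replica} macros from the ConfigMap: create a replicated local table on every node, put a Distributed table on top of it, and read back what you insert.

-- Database and local, replicated table on every node (one ZooKeeper path per shard)
CREATE DATABASE IF NOT EXISTS warehouse ON CLUSTER cluster_2S_2R;

CREATE TABLE warehouse.events ON CLUSTER cluster_2S_2R
(
    event_time DateTime,
    user_id    UInt64,
    payload    String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events', '{replica}')
ORDER BY (event_time, user_id);

-- Distributed table that routes inserts and queries across both shards
CREATE TABLE warehouse.events_dist ON CLUSTER cluster_2S_2R
AS warehouse.events
ENGINE = Distributed(cluster_2S_2R, warehouse, events, rand());

-- Insert through the distributed table, then read it back
-- (distributed inserts are asynchronous by default, so the count may lag for a moment)
INSERT INTO warehouse.events_dist VALUES (now(), 42, 'hello');
SELECT count() FROM warehouse.events_dist;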
Troubleshooting Common Issues
ZooKeeper Connection Issues
# Check ZooKeeper logs
kubectl logs -n zookeeper zookeeper-0

# Test ZooKeeper connectivity
kubectl exec -n zookeeper zookeeper-0 -- zkCli.sh -server localhost:2181 ls /
ClickHouse Cluster Issues
# Check ClickHouse logs
kubectl logs clickhouse-cluster-0

# Verify cluster configuration
kubectl exec -it clickhouse-cluster-0 -- clickhouse-client --query "SELECT * FROM system.clusters"
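If ON CLUSTER statements hang, it is also worth inspecting the distributed DDL queue that ClickHouse keeps in ZooKeeper. The query below assumes the default queue path /clickhouse/task_queue/ddl; adjust it if you changed the distributed_ddl setting:

# Inspect the distributed DDL queue stored in ZooKeeper
kubectl exec -it clickhouse-cluster-0 -- clickhouse-client --query "SELECT name, value FROM system.zookeeper WHERE path = '/clickhouse/task_queue/ddl'"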
File Structure
Your project structure should look like this:
clickhouse-k8s/
├── README.md
├── zookeeper/
│   ├── 01-zookeeper-configmap.yaml
│   ├── 02-zookeeper-headless.yaml
│   └── 03-zookeeper-deployment.yaml
└── clickhouse/
    ├── 01-clickhouse-configmap.yaml
    ├── 02-clickhouse-headless.yaml
    ├── 03-clickhouse-deployment.yaml
    └── 04-clickhouse-service.yaml
Conclusion
Congratulations! You have successfully set up a distributed ClickHouse cluster with 2 replicas and 2 shards on Kubernetes. This setup provides both high availability through replication and horizontal scalability through sharding.