What is ZooKeeper?

So why is ZooKeeper so widely used in production distributed systems?

TL;DR Summary

Apache ZooKeeper is an open-source, centralized coordination service designed to manage configuration information, naming, and synchronization across large-scale, distributed systems.

Why is ZooKeeper used?

As distributed systems grow in complexity, coordinating tasks between multiple services and nodes becomes challenging. For example, a cluster of microservices may need to agree, consistently, on which instance leads the cluster (a “leader election” scenario). ZooKeeper provides a simple, reliable interface for building this kind of coordination.

Core Components

1. ZNodes (Data Nodes)

ZooKeeper’s primary data units are called znodes, which form a tree-like structure similar to a file system’s directories and files. Each znode can hold small amounts of data, such as configuration information or state details needed by distributed applications. There are two key types of znodes:
- Persistent ZNodes: These remain intact until explicitly deleted, surviving both the disconnect of the client that created them (e.g. an application instance) and server restarts.
- Ephemeral ZNodes: These exist only as long as the client session that created them is active. When the session ends, either because the client closes it or because it times out after the client crashes or loses connectivity, the ephemeral znodes are deleted automatically. This ensures that transient state information is cleaned up if a client fails, preventing stale data from lingering in the system.
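
To make the difference between the two znode types concrete, here is a minimal in-memory sketch. It is a toy model, not ZooKeeper's implementation: the class `ToyZnodeTree`, its methods, and the session id `42` are all hypothetical names chosen for illustration.

```python
class ToyZnodeTree:
    """In-memory stand-in for a znode tree (illustrative only)."""

    def __init__(self):
        # path -> (data, owner session id); owner is None for persistent znodes
        self.nodes = {"/": (b"", None)}

    def create(self, path, data=b"", session_id=None):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise KeyError(f"parent {parent} does not exist")
        if self.nodes[parent][1] is not None:
            # Real ZooKeeper also forbids children under ephemeral znodes.
            raise ValueError("ephemeral znodes cannot have children")
        if path in self.nodes:
            raise ValueError(f"{path} already exists")
        self.nodes[path] = (data, session_id)

    def close_session(self, session_id):
        # Delete every ephemeral znode owned by the session that just ended.
        owned = [p for p, (_, owner) in self.nodes.items() if owner == session_id]
        for path in owned:
            del self.nodes[path]

    def exists(self, path):
        return path in self.nodes


tree = ToyZnodeTree()
tree.create("/services")                                             # persistent
tree.create("/services/worker-1", b"10.0.0.5:8080", session_id=42)  # ephemeral
tree.close_session(42)  # the client's session ends

print(tree.exists("/services"))           # True: persistent znode survives
print(tree.exists("/services/worker-1"))  # False: ephemeral znode is gone
```

The persistent parent outlives the session, while the ephemeral child disappears with it. That is the property service registries and lock recipes rely on.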

2. Watches

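Clients can set “watches” on znodes, essentially registering interest in changes. When a watched znode's data or list of children changes, or the znode is deleted, ZooKeeper notifies the clients that set the watch, enabling responsive, dynamic configurations. A standard watch is a one-time trigger: it fires once, and the client must set it again to keep receiving notifications.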
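
The one-time nature of a standard watch is easy to miss, so here is a small sketch of it. This is a toy model under stated assumptions: `ToyWatchedNode` and the event string `"data changed"` are hypothetical and only mimic the behavior, not the wire protocol.

```python
class ToyWatchedNode:
    """Toy znode whose watches fire once and are then discarded."""

    def __init__(self, data):
        self.data = data
        self._watchers = []

    def get(self, watch=None):
        # Reading with a watch registers interest in the next change only.
        if watch is not None:
            self._watchers.append(watch)
        return self.data

    def set(self, data):
        self.data = data
        # Take the registered watchers and clear the list before notifying.
        fired, self._watchers = self._watchers, []
        for callback in fired:
            callback("data changed")


events = []
node = ToyWatchedNode(b"v1")
node.get(watch=events.append)  # register a watch while reading
node.set(b"v2")                # the watch fires
node.set(b"v3")                # nothing fires: the watch was already used up

print(events)  # ['data changed']
```

Note that the notification only says that something changed. As in ZooKeeper itself, the client has to read the znode again to get the new value, and can set a fresh watch in that same read.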

3. Leaders and Followers (ZooKeeper Ensemble)

ZooKeeper typically runs as a cluster, known as an ensemble, consisting of multiple servers. Among these:
- Leader: One server acts as the Leader, handling all write requests. By funneling all writes through a single leader, ZooKeeper ensures that updates are orderly, consistent, and serialized.
- Followers: Other servers are Followers. They serve read requests directly from their locally maintained copy of the znode tree, providing low-latency reads. They also participate in consensus protocols led by the Leader to replicate writes and maintain a coherent, unified state.

How it Works

Write Operations: Clients send write requests (like creating or updating a znode) to any server. If that server is a Follower, it forwards the request to the Leader. The Leader proposes the change to the Followers and commits it once a quorum (a majority of the ensemble) has acknowledged it. Servers outside that quorum may apply the update slightly later, but every server applies updates in the same order.
Read Operations: Clients can read directly from the in-memory znode tree of whichever server they are connected to, without involving the Leader. This makes reads fast, but a read from a lagging Follower can briefly return stale data; a client that needs the latest value can issue a sync before reading. Adding servers increases read capacity, although each additional voting server also adds work to every write.
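
The write and read paths can be sketched with a deliberately simplified model. Everything here is an assumption for illustration: `ToyServer`, `ToyEnsemble`, and the `reachable` flag are hypothetical, and real ZooKeeper additionally handles logging to disk, leader election, and recovery, none of which is modeled.

```python
class ToyServer:
    def __init__(self, name):
        self.name = name
        self.pending = {}    # zxid -> (path, value), proposed but not committed
        self.committed = {}  # path -> value, what local reads see
        self.reachable = True

    def propose(self, zxid, path, value):
        if not self.reachable:
            return False     # no acknowledgement from an unreachable server
        self.pending[zxid] = (path, value)
        return True

    def commit(self, zxid):
        if self.reachable and zxid in self.pending:
            path, value = self.pending.pop(zxid)
            self.committed[path] = value

    def read(self, path):
        # Reads are served locally, so they can lag behind the leader.
        return self.committed.get(path)


class ToyEnsemble:
    def __init__(self, size):
        self.servers = [ToyServer(f"s{i}") for i in range(size)]
        self.next_zxid = 1   # transaction ids give every write a global order

    def write(self, path, value):
        zxid = self.next_zxid
        self.next_zxid += 1
        acks = sum(s.propose(zxid, path, value) for s in self.servers)
        if acks < len(self.servers) // 2 + 1:
            raise RuntimeError("no quorum: write rejected")
        for s in self.servers:
            s.commit(zxid)
        return zxid


ensemble = ToyEnsemble(3)
ensemble.servers[2].reachable = False        # one follower falls behind
ensemble.write("/config", b"v1")             # still succeeds: 2 of 3 acknowledged

print(ensemble.servers[1].read("/config"))   # b'v1'
print(ensemble.servers[2].read("/config"))   # None: a stale read from the lagging server

ensemble.servers[1].reachable = False        # now only 1 of 3 is reachable
try:
    ensemble.write("/config", b"v2")
    write_failed = False
except RuntimeError:
    write_failed = True
print(write_failed)                          # True
```

The sketch shows both halves of the trade-off: a write needs only a majority, so it survives one unreachable server, and a local read on that server returns whatever it last committed.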

Benefits

Consistency: It ensures that all nodes see a single, agreed-upon source of truth.
Fault-Tolerance: As long as a majority of the ensemble is up, the service remains available. A five-server ensemble, for example, keeps working with two servers down.
High Performance: It’s optimized for reads, meaning distributed services can quickly fetch essential data.
Simplicity: Its data model and interface are straightforward, encouraging best practices for coordination.
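
The fault-tolerance claim comes down to simple majority arithmetic. The helper names below are hypothetical, and the sketch assumes plain majority quorums, which is ZooKeeper's default.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
# 3 2 1
# 4 3 1
# 5 3 2
```

A four-server ensemble tolerates no more failures than a three-server one, which is why ensembles are usually sized with an odd number of servers.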

Simple Implementation

python

import threading

from kazoo.client import KazooClient

# Step 1: Connect to the ZooKeeper ensemble
# Here 'localhost:2181' is the ZooKeeper server address.
zk = KazooClient(hosts='localhost:2181')
zk.start()  # Establish a session with ZooKeeper

# Step 2: Create a znode (if it doesn’t exist) to store configuration data
config_path = "/my_app/config"
if not zk.exists(config_path):
    # Create a persistent znode with some initial data.
    # makepath=True also creates the missing parent znode /my_app;
    # without it, create() fails because the parent does not exist.
    zk.create(config_path, b"initial_config_data", makepath=True)

# Step 3: Read the data from the znode
data, stat = zk.get(config_path)
print("Current config data:", data.decode('utf-8'))
# This prints out the data stored at /my_app/config

# Step 4: Set a watch to get notified when data changes
update_seen = threading.Event()

@zk.DataWatch(config_path)
def watch_node(data, stat):
    # kazoo calls this once immediately with the current data,
    # then again on every later change (it re-registers the watch for you).
    if stat is None:
        print("Znode deleted!")
        return
    print("Znode data is now:", data.decode('utf-8'))
    if data == b"updated_config_data":
        update_seen.set()

# Step 5: Update the znode’s data
zk.set(config_path, b"updated_config_data")
# watch_node runs on a kazoo background thread, so wait briefly for it
# to print the updated data before shutting down.
update_seen.wait(timeout=5)

# Step 6: Close the connection
zk.stop()
zk.close()

Use Cases

Configuration Management: Centralize configuration data for microservices.
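Service Discovery: Identify available services dynamically. Each service instance registers itself by creating an ephemeral znode when it starts; when the instance stops or crashes, its session ends and the znode is removed, so clients always see the current state of the system.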
Leader Election: For systems that need a single “master” node—like a master database, or a scheduler—ZooKeeper provides recipes to safely elect one leader from multiple candidates.
Synchronization and Distributed Locks: When multiple clients need to coordinate tasks and avoid conflicts, ZooKeeper can be used to implement a locking mechanism. For example, if multiple clients want to update a shared resource, a lock built on ZooKeeper ensures that only one of them holds it at a time.
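
The leader election use case is commonly built from ephemeral, sequential znodes: each candidate creates one, and the candidate holding the lowest sequence number is the leader. The sketch below models only that rule in memory. `ToyElection`, the `candidate-` prefix, and the scheduler names are hypothetical, and a real recipe would also use watches so that candidates learn when the leader's znode disappears.

```python
class ToyElection:
    """Toy model of the 'lowest sequence number wins' election rule."""

    def __init__(self):
        self._next_seq = 0
        self._candidates = {}  # znode name -> candidate id

    def join(self, candidate_id):
        # Mimics an ephemeral sequential znode: the server appends a
        # zero-padded, monotonically increasing counter to the name.
        name = f"candidate-{self._next_seq:010d}"
        self._next_seq += 1
        self._candidates[name] = candidate_id
        return name

    def leave(self, znode_name):
        # Stands in for the session ending and the ephemeral znode vanishing.
        self._candidates.pop(znode_name, None)

    def leader(self):
        if not self._candidates:
            return None
        # Zero-padded names sort in creation order.
        return self._candidates[min(self._candidates)]


election = ToyElection()
a = election.join("scheduler-a")
b = election.join("scheduler-b")
c = election.join("scheduler-c")

print(election.leader())  # scheduler-a
election.leave(a)         # the current leader's session ends
print(election.leader())  # scheduler-b
```

In practice you would rarely write this by hand. The kazoo client used in the example above ships `Election` and `Lock` recipes that implement the same idea against a real ensemble.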

Real-World Examples

Large-scale distributed frameworks like Apache Kafka (in releases that predate its ZooKeeper-free KRaft mode), Apache HBase, and Apache Hadoop use ZooKeeper under the hood.
Beyond these well-known open-source projects, numerous leading companies including Netflix, LinkedIn and Pinterest leverage ZooKeeper to power their critical infrastructure.
