MongoDB Replication
40 min
MongoDB replication provides high availability, data redundancy, and read scaling through replica sets. A replica set is a group of MongoDB instances that maintain the same data set. If one node fails, another can take over through automatic failover. Production deployments that need reliability and scalability depend on replication.
A replica set consists of one primary and one or more secondaries. The primary receives all write operations; secondaries replicate those writes and can serve read operations. If the primary becomes unavailable, the remaining members hold an election and promote one of the secondaries, provided a majority of voting members can still reach each other. This automatic failover keeps downtime short, although the set cannot accept writes until the election completes.
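The majority requirement behind elections can be sketched with a little arithmetic. The helper names below (majority, faultTolerance) are illustrative and not part of any MongoDB API; the sketch assumes every member counted is a voting member.

```javascript
// Votes needed to elect a primary: a strict majority of the voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Number of voting members that can be unavailable while an election
// can still succeed.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [3, 4, 5]) {
  console.log(
    `${n} voting members: majority ${majority(n)}, tolerates ${faultTolerance(n)} failure(s)`
  );
}
```

Note that four members tolerate no more failures than three, which is why odd member counts are usually recommended.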
Replication uses an oplog (operations log) to record all write operations. The oplog is a capped collection that stores operations in order. Secondaries continuously read new oplog entries, typically from the primary, and apply them to their own data sets. Because this happens asynchronously, a secondary can briefly trail the primary before it converges to the same data; that delay is the replication lag. Understanding the oplog enables you to monitor replication lag and troubleshoot replication issues.
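As a rough illustration of replication lag, the sketch below compares the last applied operation time of each secondary with that of the primary. The sample array is hypothetical data shaped like the members array that rs.status() returns (name, stateStr, optimeDate); the function name replicationLagSeconds is an assumption made for this example.

```javascript
// Hypothetical sample shaped like the `members` array from rs.status().
const members = [
  { name: "localhost:27017", stateStr: "PRIMARY", optimeDate: new Date("2024-01-01T12:00:10Z") },
  { name: "localhost:27018", stateStr: "SECONDARY", optimeDate: new Date("2024-01-01T12:00:09Z") },
  { name: "localhost:27019", stateStr: "SECONDARY", optimeDate: new Date("2024-01-01T12:00:04Z") },
];

// Lag of each secondary = primary's last applied op time minus its own.
function replicationLagSeconds(memberList) {
  const primary = memberList.find((m) => m.stateStr === "PRIMARY");
  if (!primary) {
    throw new Error("no primary in member list");
  }
  return memberList
    .filter((m) => m.stateStr === "SECONDARY")
    .map((m) => ({
      name: m.name,
      // Subtracting Date objects yields milliseconds.
      lagSeconds: (primary.optimeDate - m.optimeDate) / 1000,
    }));
}

console.log(replicationLagSeconds(members));
```

In a live shell, the same function could be given rs.status().members instead of the sample array.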
Read preferences control where read operations are directed. The available modes are primary (the default), primaryPreferred, secondary, secondaryPreferred, and nearest. Reading from secondaries distributes read load and can improve throughput for read-heavy workloads. However, secondary reads may return slightly stale data due to replication lag. Choosing a read preference is therefore a trade-off between consistency and performance.
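Read preferences can also be set once for a whole connection through the connection string. The sketch below builds such a string using the replicaSet, readPreference, and maxStalenessSeconds URI options; the helper name buildReplicaSetUri and the host list are assumptions for illustration only.

```javascript
const READ_PREFERENCE_MODES = [
  "primary",
  "primaryPreferred",
  "secondary",
  "secondaryPreferred",
  "nearest",
];

// Build a replica set connection string with a read preference.
// maxStalenessSeconds is optional and limits how far behind a secondary
// may be before it is skipped for reads.
function buildReplicaSetUri(hosts, replicaSet, readPreference, maxStalenessSeconds) {
  if (!READ_PREFERENCE_MODES.includes(readPreference)) {
    throw new Error(`unknown read preference mode: ${readPreference}`);
  }
  const params = new URLSearchParams({ replicaSet, readPreference });
  if (maxStalenessSeconds !== undefined) {
    if (readPreference === "primary") {
      throw new Error("maxStalenessSeconds cannot be combined with mode 'primary'");
    }
    if (maxStalenessSeconds < 90) {
      throw new Error("maxStalenessSeconds must be at least 90");
    }
    params.set("maxStalenessSeconds", String(maxStalenessSeconds));
  }
  return `mongodb://${hosts.join(",")}/?${params.toString()}`;
}

console.log(
  buildReplicaSetUri(
    ["localhost:27017", "localhost:27018", "localhost:27019"],
    "rs0",
    "secondaryPreferred",
    120
  )
);
```

A mode such as secondaryPreferred falls back to the primary when no secondary is available, which makes it a common choice when some staleness is acceptable.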
Write concerns control when write operations are acknowledged. A write concern can require acknowledgment from the primary only (w: 1), from a majority of members (w: "majority"), or from a specific number of members; the j option additionally requires the write to reach the on-disk journal. Stronger write concerns provide better durability but add latency, because the client waits for more members to confirm the write. Choose a write concern that balances durability with performance for your application.
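The effect of the w setting can be modelled in a few lines. This is a simplified sketch, not the server's actual logic: it assumes every member is a data-bearing voting member (no arbiters), and the function names requiredAcks and isAcknowledged are made up for the example.

```javascript
// Number of members that must confirm a write for a given `w` value.
// Simplifying assumption: all members are data-bearing voting members.
function requiredAcks(w, memberCount) {
  if (w === "majority") {
    return Math.floor(memberCount / 2) + 1;
  }
  if (Number.isInteger(w) && w >= 1 && w <= memberCount) {
    return w;
  }
  throw new Error(`unsupported write concern in this sketch: w=${w}`);
}

// True once enough members (the primary included) have confirmed the write.
function isAcknowledged(w, memberCount, acksReceived) {
  return acksReceived >= requiredAcks(w, memberCount);
}

console.log(requiredAcks(1, 3));
console.log(requiredAcks("majority", 3));
console.log(isAcknowledged("majority", 5, 2));
```

With w: 1 a write is acknowledged as soon as the primary has it, so it can be rolled back if the primary fails before any secondary copies it; w: "majority" closes that gap at the cost of waiting for replication.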
Replica set configuration includes member priorities (which nodes are preferred for primary), voting members (which nodes participate in elections), and arbiter nodes (nodes that vote but don't store data). Understanding replica set configuration enables you to design deployments that meet your availability and performance requirements.
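To make these settings concrete, the sketch below inspects a configuration document of the shape accepted by rs.initiate() and rs.reconfig(). The sample config is hypothetical, and summarizeConfig is an illustrative helper rather than a MongoDB function; it relies on the defaults votes: 1 and priority: 1 and on the limits of 50 members and 7 voting members per replica set.

```javascript
// Hypothetical configuration: one preferred primary, one regular
// secondary, and one arbiter that votes but stores no data.
const config = {
  _id: "rs0",
  members: [
    { _id: 0, host: "localhost:27017", priority: 2 },
    { _id: 1, host: "localhost:27018", priority: 1 },
    { _id: 2, host: "localhost:27019", arbiterOnly: true },
  ],
};

function summarizeConfig(cfg) {
  const voting = cfg.members.filter((m) => (m.votes ?? 1) > 0);
  const arbiters = cfg.members.filter((m) => m.arbiterOnly === true);
  // Members that can become primary: data-bearing with priority > 0.
  const electable = cfg.members.filter(
    (m) => !m.arbiterOnly && (m.priority ?? 1) > 0
  );

  const problems = [];
  if (cfg.members.length > 50) problems.push("more than 50 members");
  if (voting.length > 7) problems.push("more than 7 voting members");
  if (voting.length % 2 === 0) problems.push("even number of voting members");
  if (electable.length === 0) problems.push("no member can become primary");
  for (const m of cfg.members) {
    if ((m.votes ?? 1) === 0 && (m.priority ?? 1) > 0) {
      problems.push(`non-voting member ${m.host} must have priority 0`);
    }
  }

  return {
    votingCount: voting.length,
    arbiterCount: arbiters.length,
    electableHosts: electable.map((m) => m.host),
    problems,
  };
}

console.log(summarizeConfig(config));
```

The member with the highest priority is preferred in elections, so in this sample localhost:27017 would normally end up as primary whenever it is healthy and up to date.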
Key Concepts
- Replica sets provide high availability and data redundancy.
- The primary handles all writes; secondaries replicate its data.
- Automatic failover ensures high availability.
- Read preferences control where reads are directed.
- Write concerns control write acknowledgment requirements.
Learning Objectives
Master
- Setting up and configuring replica sets
- Understanding primary and secondary node roles
- Configuring read preferences and write concerns
- Managing replica set members
Develop
- Understanding high availability architectures
- Designing reliable MongoDB deployments
- Implementing read scaling strategies
Tips
- Initialize replica set: rs.initiate({ _id: 'rs0', members: [...] }).
- Check replica set status: rs.status() to see member states.
- Add members: rs.add('host:port') to add new nodes.
- Use read preferences: db.collection.find().readPref('secondary') for read scaling.
Common Pitfalls
- Running too few members, so the set cannot reach a majority and elect a new primary after a failure.
- Ignoring replication lag, so reads from secondaries return stale data without anyone noticing.
- Using a write concern that is too weak for the data, so acknowledged writes can be lost during failover.
- Not planning for automatic failover, so applications and operators are caught off guard when the primary changes.
Summary
- Replication provides high availability and data redundancy.
- Replica sets automatically handle failover and data synchronization.
- Read preferences and write concerns enable performance and durability tuning.
- Understanding replication enables reliable, scalable MongoDB deployments.
Exercise
Set up and manage MongoDB replica sets.
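// Start three replica set members (run each command in its own terminal;
// the --dbpath directories must already exist). Any member can become
// primary: the role is decided by an election, not by start order.
mongod --port 27017 --dbpath /data/rs0-0 --replSet rs0
mongod --port 27018 --dbpath /data/rs0-1 --replSet rs0
mongod --port 27019 --dbpath /data/rs0-2 --replSet rs0
// The remaining commands run in the MongoDB shell connected to one member,
// for example: mongosh --port 27017
// Initialize replica set
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "localhost:27017" },
    { _id: 1, host: "localhost:27018" },
    { _id: 2, host: "localhost:27019" }
  ]
})
// Check replica set status (member states, health, last applied operation)
rs.status()
// Check replica set configuration
rs.conf()
// Add new member to replica set (a mongod started with --replSet rs0
// must already be listening on this port)
rs.add("localhost:27020")
// Remove member from replica set
rs.remove("localhost:27020")
// Step down primary (for maintenance); this triggers an election
rs.stepDown()
// Reconfigure: give member 2 priority 0 so it can never be elected primary.
// Run this on the current primary. Fetching and modifying the existing
// configuration avoids overwriting other settings.
cfg = rs.conf()
cfg.members[2].priority = 0
rs.reconfig(cfg)
// Read from secondary (results may lag slightly behind the primary)
db.users.find().readPref("secondary")
// Write with write concern: wait for a majority of members and the journal
db.users.insertOne(
  { name: "New User" },
  { writeConcern: { w: "majority", j: true } }
)
// Check oplog: show the five most recent operations
use local
db.oplog.rs.find().sort({ $natural: -1 }).limit(5)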
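When reading the oplog query output, the op field of each entry identifies the kind of operation: i (insert), u (update), d (delete), c (command), or n (no-op). The sketch below tallies entries by type. The sample entries are hypothetical and heavily trimmed (real oplog documents carry more fields, such as the ts timestamp), and summarizeOplog is an illustrative helper name.

```javascript
// Meaning of the single-letter `op` codes found in oplog entries.
const OP_NAMES = {
  i: "insert",
  u: "update",
  d: "delete",
  c: "command",
  n: "no-op",
};

// Count oplog entries per operation type.
function summarizeOplog(entries) {
  const counts = {};
  for (const entry of entries) {
    const name = OP_NAMES[entry.op] ?? `unknown (${entry.op})`;
    counts[name] = (counts[name] ?? 0) + 1;
  }
  return counts;
}

// Hypothetical, trimmed-down entries for illustration only.
const sampleEntries = [
  { op: "i", ns: "test.users" },
  { op: "u", ns: "test.users" },
  { op: "n", ns: "" },
  { op: "i", ns: "test.users" },
];

console.log(summarizeOplog(sampleEntries));
```

In a live shell, the array returned by db.oplog.rs.find().sort({ $natural: -1 }).limit(5).toArray() could be passed in place of the sample.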
Exercise Tips
- Use an odd number of voting members (3, 5, or 7); an even count adds cost without improving fault tolerance.
- Monitor replication lag with rs.printSecondaryReplicationInfo() (older shells call it rs.printSlaveReplicationInfo()).
- Use write concern 'majority' for durability: { writeConcern: { w: 'majority' } }.
- Configure member priorities: { _id: 0, host: '...', priority: 2 } for preferred primary.