MongoDB sharding
bogotobogo.com site search:
MongoDB 3.0
Why sharding?
"Sharding, or horizontal scaling, divides the data set and distributes the data over multiple servers, or shards. Each shard is an independent database, and collectively, the shards make up a single logical database." - Sharding Introduction
Here are the reasons we may want to do sharding (from MongoDB - Sharding).
- In replication all writes go to master node.
- Latency sensitive queries still go to master.
- Single replica set has limitation of 12 nodes.
- Memory can't be large enough when active dataset is big.
- Local Disk is not big enough.
- Vertical scaling is too expensive.
- Sharding reduces the number of operations each shard handles. Each shard processes fewer operations as the cluster grows. As a result, a cluster can increase capacity and throughput horizontally.
Terminology
- mongos (MongoDB Shard) is a routing service for MongoDB shard configurations that processes queries from the application layer, and determines the location of this data in the sharded cluster, in order to complete these operations.
- sharded cluster consists of three config processes, one or more replica sets, and one or more mongos routing processes.
- shard is a mongod instance or replica set.
MongoDB 3.0
Ph.D. / Golden Gate Ave, San Francisco / Seoul National Univ / Carnegie Mellon / UC Berkeley / DevOps / Deep Learning / Visualization