top of page

Database Sharding vs Partitioning - What are the differences

Updated: Feb 14

You can listen to the audio of this blog here

Let's dive right in -

  1. Database Sharding vs Partitioning

  2. Pros and Cons of Database Sharding

  3. The Pros of Database Sharding

  4. The Cons of Database Sharding

  5. Pros and Cons of Partitioning

  6. Pros of Partitioning

  7. Drawbacks of Partitioning

Database Sharding vs Partitioning

Partitioning literally means the process of dividing something into 2 or more pieces.

Imagine that you purchased a cake. You have a container to store this cake. But there's a problem. The container is small and the entire cake can't accommodate in the container, so you purchase another container and decide to split the cake into both these containers.

As a result, you cut the cake into 9 pieces. You put 5 pieces of cake in container 1 and the remaining pieces in container 2. You may wonder - "So Gourav, what has this to do with sharding and partitioning"

Well, in terms of the database, each piece of the cake is a partition and the containers are called shards. The process of creating partitions is called partitioning and the process of creating shards is called sharding.

Let me elaborate. Suppose you own a company and use a SQL table to store all the employees' data. This SQL table sits on a server. Now imagine that your company hired a lot of employees. So many, that the SQL database server cannot handle the load of operations performed on the database.

Clearly, it's time to scale up your database. Sharding is the way to do it. You purchase another SQL server. Now you have a total of 2 shards. The big question is how do you divide the data into these 2 shards (SQL servers)? For this, you would need to devise a strategy to partition your data. The simplest strategy, off the top of my head, is to partition based on the name initials. By this, I mean all the employees whose name starts with the alphabet 'A' can be in one partition. Similarly, all the employees, whose name starts with 'B' can be another partition, and so on. Since there are 26 alphabets, we would have 26 partitions. We can place the first 13 partitions in shard 1 and the next 13 partitions in shard 2.

It may be clear that a shard can have multiple partitions in it. But a partition can reside in only one shard. Also if a database is partitioned, it does not imply that the database is definitely sharded. But if a database is sharded, it implies that the database has definitely been partitioned.

To sum it up

  • The process of dividing data in a database into logical parts is called partitioning.

  • Sharding is a database server part