
What is the CAP theorem? Is the CAP theorem still valid?

Updated: Feb 14



Let's dive right in -

  1. What is the CAP theorem?

  2. Consistency

  3. Availability

  4. Partition Tolerance

  5. Where is the CAP theorem used in the real world?

  6. Is the CAP theorem valid in today's world?

  7. A Brief History of the CAP Theorem

  8. Bonus: CAP Theorem and Latency


What is the CAP theorem?

The CAP theorem (also known as Brewer’s theorem) was first proposed by Eric Brewer in 2000. It states that a distributed database system cannot simultaneously guarantee consistency, availability, and partition tolerance. A distributed database system can guarantee at most 2 out of these 3 properties.


Consistency

Consistency means that the data stored across all the databases and storage units is the same. If I were to read from any of the data sources, I would get the most recent write, no matter which node serves the request. A system that upholds this property is called a consistent system.
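
To make the definition concrete, here is a minimal sketch in Python. The `Replica` class and the `is_consistent` helper are hypothetical names made up for this illustration, not part of any real database. The sketch simply checks whether every copy of the data gives the same answer for a key.

```python
class Replica:
    """A toy storage node holding key -> (version, value). Hypothetical."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, value, version):
        self.data[key] = (version, value)

    def read(self, key):
        return self.data.get(key)


def is_consistent(replicas, key):
    """True when every replica returns the same (version, value) for key."""
    answers = {replica.read(key) for replica in replicas}
    return len(answers) == 1


a, b = Replica("a"), Replica("b")
a.write("balance", 100, version=1)
b.write("balance", 100, version=1)
consistent_before = is_consistent([a, b], "balance")

# A new write lands on replica a only; b has not been updated yet.
a.write("balance", 80, version=2)
consistent_during = is_consistent([a, b], "balance")

# Once the write is copied to b, reads agree again.
b.write("balance", 80, version=2)
consistent_after = is_consistent([a, b], "balance")
```

In the middle step, a read served by replica `b` would return the old balance, which is exactly the situation a consistent system must rule out.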


Availability

Availability means that every request receives a non-error response, as long as the node that receives it has not failed. Ideally the system is available 100% of the time: there is no downtime, and no request throws an exception or an error. It is important to note that an available system can still return a result that is not the latest version of the data.


Partition Tolerance

A partition is the situation in which the channel of communication between two connected nodes in a distributed system is broken. Partition Tolerance means that the system continues to work even when there is a partition, that is, when messages between nodes are dropped or delayed.



Where is the CAP theorem used in the real world?

In real-world scenarios, "Partition Tolerance" is a property you cannot give up. Most real-world systems store their data across multiple machines, and network failures between those machines are bound to happen, so you want your system to tolerate such message outages.


That leaves us to choose between Consistency and Availability. As per the CAP theorem, we can pick one of these two combinations:

  • CP (Consistency and Partition Tolerance)

  • AP (Availability and Partition Tolerance)
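
The difference between the two choices can be sketched in a few lines of Python. Everything here (`Node`, `Unavailable`, `replicate`, the `partitioned` flag) is a hypothetical toy model written for this post, not how any particular database is implemented. During a partition, the CP node refuses to answer, while the AP node answers with whatever it has.

```python
class Unavailable(Exception):
    """Raised by a CP node that refuses to answer during a partition."""


class Node:
    def __init__(self, mode):
        assert mode in ("CP", "AP")
        self.mode = mode
        self.value = None
        self.partitioned = False  # True when this node cannot reach its peers

    def read(self):
        if self.partitioned and self.mode == "CP":
            # CP: fail rather than risk returning stale data
            raise Unavailable("cannot confirm the latest value during a partition")
        # AP (or no partition): always answer, even if the value may be stale
        return self.value


def replicate(value, nodes):
    """Copy a write to every node that is still reachable."""
    for node in nodes:
        if not node.partitioned:
            node.value = value


cp, ap = Node("CP"), Node("AP")
replicate("v1", [cp, ap])

# The network splits: neither node receives the next write.
cp.partitioned = ap.partitioned = True
replicate("v2", [cp, ap])

ap_answer = ap.read()  # answered, but with the older value
try:
    cp.read()
    cp_answer = "answered"
except Unavailable:
    cp_answer = "error"
```

The AP node stays available at the cost of a stale read, and the CP node stays consistent at the cost of an error.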


As you start to scale, partitioning your data sources becomes a necessity, and the trade-off between consistency and availability becomes a real concern.


If you want to know more about System Scaling, check out this blog on Horizontal vs Vertical Scaling: https://www.thegeekyminds.com/post/all-about-scaling-a-system-horizontal-scaling-vs-vertical-scaling-system-design

If a NoSQL database is partitioned, it can be either CP (Consistency and Partition Tolerance) or AP (Availability and Partition Tolerance). If a NoSQL database is CA (Consistency and Availability), it means it has not been partitioned and is monolithic.


CP (Consistency and Partition Tolerance) with NoSQL


MongoDB is an example of a NoSQL database that follows CP (Consistency and Partition Tolerance). MongoDB stores data across several nodes that form a replica set, with one primary node and one or more secondary nodes. Every two seconds the members of the replica set send heartbeat pings to each other to check that they are alive. If a member does not respond within 10 seconds, the other members mark it as inaccessible.
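
The heartbeat idea can be sketched roughly as follows. This is a simplified illustration, not MongoDB's actual implementation: the `HeartbeatMonitor` class, the node names, and the timing values are all assumptions chosen for the example. Timestamps are passed in explicitly so the sketch runs instantly.

```python
HEARTBEAT_TIMEOUT = 10.0  # seconds; assumed value for this sketch


class HeartbeatMonitor:
    """Tracks when each node last answered a heartbeat (hypothetical helper)."""

    def __init__(self, nodes, now=0.0):
        self.last_seen = {node: now for node in nodes}

    def record_reply(self, node, now):
        self.last_seen[node] = now

    def unreachable(self, now):
        """Nodes whose last reply is older than the timeout."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > HEARTBEAT_TIMEOUT
        )


monitor = HeartbeatMonitor(["node-a", "node-b", "node-c"])

# Simulate a heartbeat round every 2 seconds (assumed interval).
# node-a and node-b reply every time; node-c never replies.
for t in (2.0, 4.0, 6.0, 8.0, 10.0, 12.0):
    monitor.record_reply("node-a", t)
    monitor.record_reply("node-b", t)

down = monitor.unreachable(now=12.0)
```

After 12 simulated seconds, `down` contains only `node-c`, the node that stopped answering.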