Artigo: CAP Theorem on distributed systems



Consistência:
- [Example 1]: A single MySQL database instance, automatically is fully
consistent since there is only one node keeping the state. Guarants that
once you store a state (lets say “x=y”), it will report the
same state in every subsequent operation until the state is explicitly
changed by something outside the system.
- [Example 2]: Two MySQL servers keeping diferentt range of data, can still
easily guarantee consistency.
- [Example 3]: Two MySQL servers keeping the same data (master-master
replicas). If one of the database accepts get a “row insert”
request, that information has to be committed to the second system before
the operation is considered complete. To require 100% consistency in such a
replicated environment, communication between nodes is paramount. The over
all performance of such a system could drop as number of replica’s
required goes up.
Disponibilidade:
- [Example 1]: not highly available, if the node goes down there would 100%
data loss.
- [Example 2]: not highly available, if one node goes down, you will have
50% data loss.
- [Example 3]: highly available, a simple MySQL server replication [
Multi-master mode] setup could provide 100% availability. Increasing the
number of nodes with copies of the data directly increases the availability
of the system. Availability is not just a protection from hardware failure.
Replicas also help in loadbalancing concurrent operations, especially the
read operations. “Slave” MySQL instances are a perfect example
of such “replicas”.
Partition-tolerance:
- Example 3 - in two different datacenters]: lost the network connectivity
between them makes both databases incapable of synchronizing state between
the two. Would the two DBs be fully functional in such a scenario ? If you
somehow do manage allow read/write operations on these two databases, it
can be proved that the two servers won’t be consistent anymore.
- CAP rules don’t have to be applied in an “all or
nothing” fashion.
- Different systems can choose various levels of consistency, availability
or partition tolerance to meet their business objective.
- Increasing the number of replicas, for example, increases high
availability but it could at the same time reduce partition tolerance or
consistency.
- Google BigTable and Hadoop HBase: consistency and high availability.
- Amazon’s Dynamo in Cassandra sacrifice consistency in favor of
availability and partition tolerance.
- CAP theorem doesn’t just apply to databases. Even simple web
applications which store state in session objects have this problem.
- Recognizing which of the “C.A.P.” rules your business really
needs should be the first step in building any successful distributed,
scalable, highly-available system.
- Fontes:
- CAP Theorem on distributed systems, www.royans.net