Home >>Distributed DBMS Tutorial >DDBMS - Controlling Concurrency
Concurrency controlling techniques ensure that several transactions are executed concurrently while preserving the transactions' ACID properties and schedule serializability.
In this chapter, we will discuss the various approaches to controlling concurrency.
The concept of locking data items is used by Locking-based concurrency control protocols. A lock is a variable associated with a data item that determines whether it is possible to perform read/write operations on that data item. Generally, a lock compatibility matrix is used which determines whether two transactions at the same time can lock a data item.
Either one-phase or two-phase locking protocols can be used by locking-based concurrency control systems.
One-phase Locking Protocol
In this method, before use, each transaction locks an item and releases the lock as soon as it has completed its use. This locking approach allows for optimum concurrency, but serializability is not always enforced.
Two-phase Locking Protocol
All locking operations in this procedure precede the first lock-release or unlock operation. This transaction consists of two steps. In the first step, only all the locks it needs are acquired by a transaction and no lock is issued. This is called the expanding phase or the rising phase. The transaction unlocks the locks in the second process and unable to order any new locks. The shrinking process is called this.
Serialization is guaranteed for any transaction that follows the two-phase locking protocol. This approach, however, offers little parallelism between two transactions that conflict.
Timestamp-based algorithms for concurrency control use the timestamp of a transaction to coordinate simultaneous access to a data item to ensure serializability. A timestamp is a unique DBMS identifier given to a transaction that represents the start time of the transaction.
Such algorithms ensure that transactions are committed in the order that their timestamps determine. Before a younger transaction, an older transaction should be committed, since the older transaction joins the system before the younger one.
Serializable schedules are created by timestamp-based concurrency control techniques so that the equivalent serial schedule is organized in order of the age of the participating transactions.
Some of the algorithms for concurrency control based on timestamps are
Three guidelines for applying serializability are accompanied by timestamp-based ordering-
The task of validating every transaction for serializability may lower performance in systems with low conflict rates. In these examples, the serializability test is postponed to just before committing. Since the rate of dispute is low, the risk of aborting non-serializable transactions is also low. This approach is called the optimistic technique of concurrency control.
In this method, the life cycle of a transaction is split into the following 3 phases:
Execution Phase- A transaction fetches and performs operations on data items in memory.
Validation Phase- A transaction performs checks to ensure that serializability checking is passed by committing its changes to the database.
Commit Phase−A transaction writes the changed data item back to the disks in memory.
This algorithm utilizes three rules in the validation process to implement serializability.
Rule 1 − Given two transactions Ti and Tj, if Ti is reading the data item which Tj is writing, then Ti’s execution phase cannot overlap with Tj’s commit phase. Tj can commit only after Ti has finished execution.
Rule 2 − Given two transactions Ti and Tj, if Ti is writing the data item that Tj is reading, then Ti’s commit phase cannot overlap with Tj’s execution phase. Tj can start executing only after Ti has already committed.
Rule 3 − Given two transactions Ti and Tj, if Ti is writing the data item which Tj is also writing, then Ti’s commit phase cannot overlap with Tj’s commit phase. Tj can start to commit only after Ti has already committed
In this section, we will see how the above techniques are implemented in a distributed database system.
Distributed Two-phase Locking Algorithm
Like the basic two-phase locking protocol, the basic concept of distributed two-phase locking is the same. There are, however, sites designated as lock managers in a distributed system. A lock manager manages requests from transaction monitors for lock acquisition. In order to enforce coordination between lock managers at different sites, the control to see all transactions and detect lock disputes is granted to at least one site.
Distributed two-phase locking methods can be of three kinds, depending on the number of sites that can detect lock conflicts.
Distributed Timestamp Concurrency Control
The time stamp of any transaction in a centralized system is calculated by the reading of the physical clock. But, in a distributed system, the local physical / logical clock readings of any site should not be used as global timestamps because they are not special globally. So, a timestamp consists of a combination of the site ID and the clock reading of that site.
Each site has a scheduler that maintains a separate queue for each transaction manager for the implementation of timestamp ordering algorithms. A transaction manager sends a lock request to the site's scheduler during the transaction. The scheduler sends the request in increasing timestamp order to the corresponding queue. Requests are sorted in the order of their timestamps, i.e. the oldest first, from the front of the queues.
Conflict Graphs
The creation of conflict graphs is another method. Classes are specified for this transaction. Two sets of data items called read set and write set are included in a transaction class. If the transaction's read set is a subset of the class 'read set, a transaction belongs to a specific class, and the transaction's write set is a subset of the class' write set. Each transaction issues its read requests in the reading process for the data items in its read set. Each transaction issues its writing requests in the writing process.
For the classes to which active transactions belong, a conflict graph is generated. It requires a set of edges that are vertical, horizontal, and diagonal. Inside a class, a vertical edge connects two nodes and denotes conflicts within the class. Two nodes between the two classes are connected by a horizontal edge and denote a write-write dispute between various classes. Two nodes between two classes are connected by a diagonal edge and a write-read or a read-write conflict between two classes is represented.
The conflict graphs are analyzed to decide whether it is possible to run two transactions in parallel within the same class or between two different classes.
Distributed Optimistic Concurrency Control Algorithm
The distributed algorithm for optimistic concurrency control extends the algorithm for optimistic concurrency control. Two laws are applicable to this extension: