CONCURRENCY CONTROL

The Basics

Database: a fixed set of named resources (e.g. tuples, pages, files, ...).
Consistency Constraints: conditions that must be true for the DB to be considered consistent.
Transaction: a sequence of actions bracketed by begin and end statements. Each transaction ("xact") is assumed to maintain consistency.

The ACID test

Atomicity: Either all actions within a transaction occur, or none do.
Consistency: Each transaction takes the DB from one consistent state to another.
Isolation: Events within a transaction must be invisible to other transactions.
Durability: Results of committed transactions must be preserved even in the case of failures.

Concurrency Control provides Isolation

The system "understands" only reads and writes; it cannot assume any semantics of other operations.
Arbitrary interleaving can lead to:
* temporary inconsistencies (ok, unavoidable)
* "permanent" inconsistencies, that is, inconsistencies that remain after transactions have completed.

Some Definitions

Schedule: A "history" or "audit trail" of all actions in the system, the xacts that performed them, and the objects they affected.
Serial Schedule: A schedule is serial if all actions of each single xact appear together.
Serializability: A schedule S is serializable if there exists a serial schedule S' such that S and S' are computationally equivalent.

Equivalence of schedules

Two schedules S1, S2 are considered computationally equivalent if:
* The same set of transactions participates in S1 and S2.
* For each data item Q in S1, if transaction Ti executes read(Q) and the value of Q read by Ti was written by Tj, then the same holds in S2. [reads are all the same]
* For each data item Q in S1, if transaction Ti executes the last write(Q) instruction, then the same holds in S2. [the same writers "win"]

Dependencies

T1 reads N ... T2 writes N: a RW dependency
T1 writes N ... T2 reads N: a WR dependency
T1 writes N ... T2 writes N: a WW dependency

Serialization graph

Nodes are transactions T1, ..., Tn
Edge Ti -> Tj if there is a RW, WR, or WW dependency from Ti to Tj
Theorem: A schedule S is serializable iff SG(S) is acyclic.

Locking

A technique to ensure serializability while, hopefully, preserving high concurrency as well. The winner in industry.
A "lock manager" records what entities are locked, by whom, and in what "mode". It also maintains wait queues.
A well-formed transaction locks entities before using them, and unlocks them some time later.

Two-Phase Locking (2PL)

Growing Phase: A transaction may obtain locks but not release any lock.
Shrinking Phase: A transaction may release locks, but not obtain any new lock. (In fact, locks are usually all released at once to avoid "cascading aborts".)
Theorem: If all xacts are well-formed and follow 2PL, then any resulting schedule is serializable.

Lock Granularity

Granularity tradeoff: small granularity (e.g. field of a tuple) means high concurrency but high overhead. Large granularity (e.g. file) means low overhead but low concurrency.
Possible granularities:
* DB
* Files
* Pages
* Tuples (records)
* Fields of tuples

Hierarchical locking

Allow "large" xacts to set large locks, "small" xacts to set small locks.
Problem: T1 S-locks a record in a file, then T2 X-locks the whole file. How can T2 discover that T1 has locked the record?
Solution: "Intention" locks. T1 obtains an S lock on the record in question, but first gets an IS lock on the file. Now T2 cannot get an X lock on the file. But T3 can get IS or S.

Typical Lock Implementations

Maintain a lock table as a hashed main-memory structure.
Lock/unlock must be atomic operations (protected by a critical section).
Suppose T1 has an S lock on P, T2 is waiting to get an X lock on P, and now T3 wants an S lock on P. Do we grant T3 an S lock? No! (starvation, unfair, etc.)
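
The intention-lock and lock-table points above can be illustrated with a small sketch. This is a single-threaded toy under several assumptions, not a description of any real lock manager: the names `COMPATIBLE`, `LockTable`, `request`, and `release` are hypothetical, the IX mode comes from the standard intention-lock scheme rather than from these notes, and lock conversions are ignored. A real implementation would also wrap `request` and `release` in a critical section.

```python
from collections import deque

# COMPATIBLE[held] is the set of modes that may be granted alongside `held`.
# IX is assumed from the standard intention-lock scheme (not named in the notes).
COMPATIBLE = {
    "IS": {"IS", "IX", "S"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "X": set(),
}


class LockTable:
    """Hashed in-memory lock table with one FCFS wait queue per object."""

    def __init__(self):
        self.granted = {}  # obj -> {txn: mode}
        self.waiting = {}  # obj -> deque of (txn, mode)

    def _compatible(self, obj, mode):
        # True if `mode` is compatible with every lock currently granted on obj.
        holders = self.granted.get(obj, {})
        return all(mode in COMPATIBLE[held] for held in holders.values())

    def request(self, txn, obj, mode):
        """Return True if granted immediately, False if the request is queued."""
        queue = self.waiting.setdefault(obj, deque())
        # Grant only if nobody is already waiting: this is what keeps a
        # stream of readers from starving a queued writer.
        if not queue and self._compatible(obj, mode):
            self.granted.setdefault(obj, {})[txn] = mode
            return True
        queue.append((txn, mode))
        return False

    def release(self, txn, obj):
        """Drop txn's lock on obj, then grant waiters from the front, in order."""
        self.granted.get(obj, {}).pop(txn, None)
        queue = self.waiting.get(obj, deque())
        woken = []
        while queue and self._compatible(obj, queue[0][1]):
            waiter, mode = queue.popleft()
            self.granted.setdefault(obj, {})[waiter] = mode
            woken.append(waiter)
        return woken


# Usage: T1 holds IS on the file, so T2's X request must wait, and a later
# S request from T3 queues behind T2 even though it is compatible with IS.
table = LockTable()
table.request("T1", "file", "IS")
table.request("T2", "file", "X")
table.request("T3", "file", "S")
```

Checking the wait queue before checking compatibility is the design choice that answers the T3 question: compatibility with the granted locks alone is not enough to be granted.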

Lock Queue

Manage a FCFS queue for each locked object with outstanding requests.
All xacts that are adjacent and compatible are a compatible group.
The front group is the granted group.
Group mode is the most restrictive mode amongst group members.

Conversions

Often want to convert (e.g. S to X for "test and modify" actions).
Should conversions go to the back of the queue? No! Instant deadlock. So put conversions right after the granted group.

Degrees of Consistency

Serializability is degree 3 consistency.
Often live with less consistency for performance reasons.
Degree 2 consistency: ignore R->W conflicts.
Degree 1 consistency: ignore W->R conflicts also.
Degree 0 consistency: still well-formed w.r.t. writes.

Deadlock

In the OS world, deadlock is usually due to errors or overloads.
In the DB/xact world with 2PL, it is inherent.
Most common causes:
* Differing access orders
    T1: X-lock P
    T2: X-lock Q
    T1: X-lock Q // block waiting for T2
    T2: X-lock P // block waiting for T1
* Lock-mode upgrades
    T1: S-lock P
    T2: S-lock P
    T1: convert S-lock on P to X-lock // block
    T2: convert S-lock on P to X-lock // block
Solution is deadlock detection (or deadlock avoidance).

Deadlock Detection

Use a "waits-for" graph and look for cycles.
Empirically, in actual systems the waits-for graph shows:
* cycles fairly rare
* cycle length usually 2, sometimes 3, virtually never >3
Upon a block, start the cycle search from the blocked transaction.

Victim selection

* current blocker
* youngest XACT
* least resources used
* fewest locks held
* fewest number of restarts

Optimistic Concurrency Control

Conflicts are rare, so optimize for the common case.
Do not spend time getting locks etc. Instead, figure all this out at commit time.

Three Phase Execution

Read. Here, all writes are to private storage (shadow copies).
Validation. Make sure no conflicts have occurred.
Write. If Validation was successful, make writes public. (If not, abort!)

Favorable Scenarios

* All transactions are readers.
* Lots of transactions, each accessing/modifying only a small amount of data, with a large total amount of data.
* Fraction of transaction execution in which conflicts "really take place" is small compared to total path length.

The Validation Phase

Guarantee that only serializable schedules result by actually finding an equivalent serial schedule.
Assign each transaction a TN (transaction number) during execution. Do this just before validation begins.
Ensure that if you run transactions in the order induced by "<" on TNs, you get an equivalent schedule.
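
The read, validation, and write phases above can be made concrete with a short sketch. It assumes one particular validation rule (serial, backward validation): a transaction is aborted if any transaction that committed after it started wrote an item it read. The notes do not spell out the rule, so treat that choice, and the names `Database`, `Transaction`, `begin`, and `commit`, as hypothetical. For simplicity the TN is taken inside `commit`, which runs validation and the write phase as one uninterrupted step, and only committing transactions consume a number.

```python
class Transaction:
    def __init__(self, db):
        self.db = db
        self.start_tn = db.tn_counter  # highest TN committed when the read phase began
        self.read_set = set()          # keys read during the read phase
        self.writes = {}               # private shadow copies, invisible to others

    def read(self, key):
        # Recording the key even when it is served from the shadow copy is
        # conservative: it can only cause extra aborts, never missed conflicts.
        self.read_set.add(key)
        if key in self.writes:
            return self.writes[key]
        return self.db.data.get(key)

    def write(self, key, value):
        self.writes[key] = value  # read phase: write to private storage only


class Database:
    def __init__(self, data=None):
        self.data = dict(data or {})
        self.tn_counter = 0
        self.committed = []  # (tn, keys written) for each committed transaction

    def begin(self):
        return Transaction(self)

    def commit(self, txn):
        """Validate, then write. Returns True on commit, False on abort."""
        # Validation: any transaction with a TN after txn's start committed
        # during txn's read phase; its writes must not touch what txn read.
        for tn, write_set in self.committed:
            if tn > txn.start_tn and write_set & txn.read_set:
                return False
        # Write phase: take the next TN and make the shadow copies public.
        self.tn_counter += 1
        self.data.update(txn.writes)
        self.committed.append((self.tn_counter, frozenset(txn.writes)))
        return True


# Usage: t1 and t2 both read x; t1 commits first, so t2 fails validation.
db = Database({"x": 1, "y": 1})
t1, t2 = db.begin(), db.begin()
t1.write("x", t1.read("x") + 1)
t2.write("y", t2.read("x") + 10)
db.commit(t1)  # commits with TN 1
db.commit(t2)  # aborts: t2 read x, which TN 1 overwrote
```

Because every committed transaction passed this check against all earlier TNs, running the committed transactions serially in TN order gives an equivalent schedule, which is the guarantee stated above.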