ACID Properties in DBMS
· ACID
· Atomicity (A)
∘ Atomicity in Distributed Systems:
· Consistency (C)
∘ Consistency in Data
∘ Consistency in Reads (CAP theorem) / Consistency in Distributed Systems:
· Isolation
∘ Read phenomena
∘ Dirty Read / Temporary update
∘ Non-Repeatable Reads
∘ Phantom Reads
∘ Lost Updates
· Transaction isolation levels
· Durability
∘ Durability in OS
∘ Durability Techniques
∘ Durability in Distributed Systems:
· ACID Tradeoff
· Advantages of ACID Properties in DBMS: [source]
· Disadvantages of ACID Properties in DBMS: [source]
· Responsibility for maintaining properties
· Further reads
ACID
ACID is a set of properties of database transactions that are used to provide guarantees around the expected behavior of transactions in the event of errors, power failures etc.
Atomicity (A)
Atomicity is also known as the ‘All or nothing rule’.
- this property guarantees that a transaction composed of multiple operations is treated as a single unit.
- This means that either all operations of the transaction are executed or none of them is.
- It involves the following two operations.
— Abort: If a transaction aborts, changes made to database are not visible.
— Commit: If a transaction commits, changes made are visible.
Example — Consider the following transaction T consisting of T1 and T2: Transfer of 100 from account X to account Y.
If the transaction fails after completion of T1 but before completion of T2.( say, after write(X) but before write(Y)), then amount has been deducted from X but not added to Y. This results in an inconsistent database state. Therefore, the transaction must be executed in entirety in order to ensure correctness of database state.
Atomicity in Distributed Systems:
- the system might need to execute the same operation in multiple nodes of the system in an atomic way so that the operation is either executed to all the nodes or none. [more covered in distributed transactions.]
Consistency (C)
- Consistency in Data
- Consistency in Reads
Consistency in Data
- This property guarantees that a transaction can only transition the database from one valid state to another valid state, maintaining any database invariants.
→ these invariants are application-specific and defined by every application accordingly. - For example, if an application has a table A with records that refer to records in table B through a foreign key relationship, the database will prevent the transaction from deleting a record from table A, unless any records in table B are referenced from this record have already been deleted.
- A consistent transaction will not violate integrity constraints placed on the data by the database rules. Enforcing consistency ensures that if a database enters into an illegal state (if a violation of data integrity constraints occurs) the process will be aborted and changes rolled back to their previous, legal state.
ensures that transactions only make changes to tables in predefined, predictable ways. Transactional consistency ensures that corruption or errors in your data do not create unintended consequences for the integrity of your table.
- This means that integrity constraints must be maintained so that the database is consistent before and after the transaction. It refers to the correctness of a database.
- Example — The total amount before and after the transaction must be maintained.
Total before T occurs = 500 + 200 = 700.
Total after T occurs = 400 + 300 = 700.
Therefore, the database is consistent. Inconsistency occurs in case T1 completes but T2 fails. As a result, T is incomplete.
Consistency in Reads (CAP theorem) / Consistency in Distributed Systems:
- The consistency in read is when you update a value and then a transaction tries to read that value after it was committed, you get the old version and that’s an inconsistent result.
- most databases heal from this kind of inconsistency by eventual consistency.
Isolation
- this property guarantees that even though transactions might be running concurrently and have data dependencies, the end result will be as if one of them was executing at a time, so that there was no interference between them.
- prevents a large number of anomalies.
- Isolated transactions are considered to be “serializable”, meaning each transaction happens in a distinct order without any transactions occurring in tandem.
- This property ensures that multiple transactions can occur concurrently without leading to the inconsistency of database state. Transactions occur independently without interference.
- Changes occurring in a particular transaction will not be visible to any other transaction until that particular change in that transaction is written to memory or has been committed.
- This property ensures that the execution of transactions concurrently will result in a state that is equivalent to a state achieved these were executed serially in some order.
- Let X= 500, Y = 500.
Consider two transactions T and T”.
- Suppose T has been executed till Read (Y) and then T’’ starts. As a result , interleaving of operations takes place due to which T’’ reads correct value of X but incorrect value of Y and sum computed by
T’’: (X+Y = 50, 000+500=50, 500)
is thus not consistent with the sum at end of transaction:
T: (X+Y = 50, 000 + 450 = 50, 450).
This results in database inconsistency, due to a loss of 50 units. Hence, transactions must take place in isolation and changes should be visible only after they have been made to the main memory.
Read phenomena
Dirty Read / Temporary update
A dirty read happens when a transaction reads data that has been written by another running transaction and that data is subsequently rolled back. As a result, the first transaction ends up reading data that technically doesn’t exist.
- A dirty read occurs when a transaction reads data that has not yet been committed.
- For example, suppose transaction 1 updates a row. Transaction 2 reads the updated row before transaction 1 commits the update. If transaction 1 rolls back the change, transaction 2 will have read data that is considered never to have existed.
Non-Repeatable Reads
- A nonrepeatable read occurs when a transaction reads the same row twice but gets different data each time.
- For example, suppose transaction 1 reads a row. Transaction 2 updates or deletes that row and commits the update or delete. If transaction 1 rereads the row, it retrieves different row values or discovers that the row has been deleted.
Phantom Reads
- A phantom is a row that matches the search criteria but is not initially seen.
- For example, suppose transaction 1 reads a set of rows that satisfy some search criteria. Transaction 2 generates a new row (through either an update or an insert) that matches the search criteria for transaction 1. If transaction 1 re-executes the statement that reads the rows, it gets a different set of rows.
Lost Updates
Transaction isolation levels
Durability
- this property guarantees that once a transition has been committed, it will remain committed even in the case of failure.
- In the context of single-node, centralized systems, this usually means that completed transactions (and their effects) are recorded in non-volatile storage.
ensures that changes to your data made by successfully executed transactions will be saved, even in the event of system failure.
- In a nutshell, is the process of persisting the writes that clients make to the database to a non-volatile system storage system.
Durability in OS
- OS doesn’t write to disk, it writes to its own memory cache. And the reason is because it will want to batch these writes and flush it to disk at once for performance reasons because most applications does a lot of writes and batch write do less IO (less IO is better)
- When the writes go the OS cache, an OS crash, machine restart could lead to loss of data
- Fsync OS command forces writes to always go to disk. fsync can be expensive and slows down commits
Durability Techniques
→ Write Ahead Log (WAL)
- Writing a lot of data to disk is expensive (indexes, data files, columns, rows, etc.)
- That is why DBMSs persist a compressed version of the changes known as WAL (write-ahead-log segments)
→ Asynchronous snapshot
→ AOF (Append Only File)
Durability in Distributed Systems:
- in the context of distributed systems, this might mean that transactions need to be durably stored in multiple nodes so that recovery is possible even in the presence of total failures of a node along with its storage facilities.
ACID Tradeoff
Advantages of ACID Properties in DBMS: [source]
- Data Consistency: ACID properties ensure that the data remains consistent and accurate after any transaction execution.
- Data Integrity: ACID properties maintain the integrity of the data by ensuring that any changes to the database are permanent and cannot be lost.
- Concurrency Control: ACID properties help to manage multiple transactions occurring concurrently by preventing interference between them.
- Recovery: ACID properties ensure that in case of any failure or crash, the system can recover the data up to the point of failure or crash.
Disadvantages of ACID Properties in DBMS: [source]
- Performance: The ACID properties can cause a performance overhead in the system, as they require additional processing to ensure data consistency and integrity.
- Scalability: The ACID properties may cause scalability issues in large distributed systems where multiple transactions occur concurrently.
- Complexity: Implementing the ACID properties can increase the complexity of the system and require significant expertise and resources.
Overall, the advantages of ACID properties in DBMS outweigh the disadvantages. They provide a reliable and consistent approach to data - management, ensuring data integrity, accuracy, and reliability. However, in some cases, the overhead of implementing ACID properties can cause performance and scalability issues. Therefore, it’s important to balance the benefits of ACID properties against the specific needs and requirements of the system.
Responsibility for maintaining properties
Further reads
- https://www.geeksforgeeks.org/concurrency-problems-in-dbms-transactions/
- https://learn.microsoft.com/en-us/sql/odbc/reference/develop-app/transaction-isolation-levels?view=sql-server-ver16
- https://tarunjain07.medium.com/acid-properties-in-dbms-c91a92c09b09
- https://www.geeksforgeeks.org/acid-model-vs-base-model-for-database/