See what "ACID" is in other dictionaries. Transactions. Properties of ACID transactions. Recovery management. ARIES algorithm. Two-phase fixation

This is also a database, just a manual one

Hello!

Let's talk about databases. Today we cover transactions, ACID, and the CAP theorem: theory that will be important for the articles that follow.

Transactions

A transaction is a set of operations on data combined into a single logical unit. It is either executed in full or not at all. The classic example is transferring money from one account to another:

Start transaction
  read the balance of account No. 5
  decrease the balance by 10 monetary units
  save the new balance of account No. 5
  read the balance of account No. 7
  increase the balance by 10 monetary units
  save the new balance of account No. 7
End transaction

If these operations are performed outside of a transaction, an error in any of them will leave both accounts in an inconsistent state: for example, the money has been withdrawn from one account but has not arrived in the other. With a transaction there are no such problems: if an error occurs, all operations are rolled back and nothing changes for the user.
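The same transfer as a minimal SQL sketch. It assumes a hypothetical accounts table with id and balance columns; the exact syntax varies slightly between DBMSs:

BEGIN;
UPDATE accounts SET balance = balance - 10 WHERE id = 5;  -- debit account No. 5
UPDATE accounts SET balance = balance + 10 WHERE id = 7;  -- credit account No. 7
COMMIT;
-- if any statement fails, ROLLBACK is issued instead and neither account changes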

It all sounds simple while a single user is working with the database: they execute transactions one after another, and there is no problem of one transaction interfering with another. Everything changes when there are many users.

This is what can happen when running non-isolated transactions in parallel.

Lost update
When two transactions write different values to the same cell, one of the changes is lost.

Dirty read
When a transaction reads data that another transaction is in the middle of changing, and that other transaction is then rolled back, so the data that was read disappears.

Non-repeatable read
When the same data is read several times within one transaction while another transaction is changing it, so each read may return a different value.

Phantom read
During its execution, one transaction selects rows by the same criteria several times. Between these selections another transaction adds or deletes rows, or changes columns used in the selection criteria of the first transaction, and commits successfully. As a result, the same selection in the first transaction returns different sets of rows.
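A hedged sketch of how a phantom read can look in SQL, assuming a hypothetical orders table; the comments mark which session runs each statement:

-- Session 1
BEGIN;
SELECT count(*) FROM orders WHERE amount > 100;  -- returns, say, 10

-- Session 2, in between the two reads of Session 1
BEGIN;
INSERT INTO orders (id, amount) VALUES (42, 150);
COMMIT;

-- Session 1, same criteria again
SELECT count(*) FROM orders WHERE amount > 100;  -- now returns 11: a phantom row appeared
COMMIT;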

Transaction isolation

To let parallel transactions execute without interfering with each other, the concept of transaction isolation was introduced. There are four standard isolation levels, though some databases add their own.

Read uncommitted
The lowest isolation level. Uncommitted changes from other transactions can be read freely, but writes are strictly sequential. This eliminates only the lost-update problem: it is guaranteed that all transactions will eventually write their values to the cell in turn.

Typically, this is done by using a write lock on cells intended to be changed within the current transaction. No locks are placed on reading.

Read committed
You can read all changes made in your own transaction and committed changes from other transactions. Lost updates and dirty reads are eliminated, while non-repeatable reads and phantoms remain.

Repeatable read
Only changes made in your own transaction can be read; data modified by other transactions is not visible. The only remaining problem is phantom reads.

Serializable
Transactions are completely isolated from each other; each executes as if no parallel transactions existed.
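In SQL the level is usually chosen per session or per transaction. A minimal sketch using the standard syntax (keywords and default levels differ between DBMSs), reusing the hypothetical accounts table from above:

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN;
SELECT balance FROM accounts WHERE id = 5;
-- re-reading this row later in the same transaction is guaranteed to return the same value
COMMIT;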

ACID

ACID is a set of four requirements for a transactional system that ensure the most reliable and predictable operation. Not all databases implement ACID in full.

Atomicity
Atomicity ensures that each transaction either completes in full or not at all. Intermediate states are not allowed.

Consistency
Consistency requires that a transaction leave the data in a valid state. This is a question of business logic rather than technology: for example, if an account balance cannot be negative, the transaction logic must check that no negative values will result.

Isolation
Ensures that concurrent transactions do not affect the outcome of other transactions. We dealt with isolation above.

Durability
The changes resulting from the transaction must remain preserved regardless of any failures. In other words, if the user received a signal that the transaction was completed, he can be sure that the data was saved.

CAP theorem

It states that a distributed system can provide only two of the following three properties: consistency, availability, and partition tolerance. The theorem helps you understand how a particular distributed system will behave and what to expect from it.

Data consistency
All nodes hold data that is consistent at every moment in time, that is, the copies do not contradict each other: if one node has a value in a database cell, every other node has the same value.

Availability
Any request to the system is processed, regardless of the state of the system.

Partition tolerance
Splitting the system into several isolated partitions does not lead to incorrect responses from any of them: the network link between two nodes may be down, yet each node can still respond correctly to its clients.

All distributed databases implement, at best, two of the three properties, sacrificing the remaining one.

The next article is about rare databases that you won’t see in regular projects. This is where this whole theory will come in handy.

7.1 ACID properties

7.1.1 During this benchmark testing, the System Under Test must have the ACID (Atomicity, Consistency, Isolation, Durability) properties that are key to transaction processing systems.

7.1.2 The purpose of this section is to informally describe the ACID properties and to specify the series of tests that must be performed to demonstrate that these properties are met.

7.1.3 No finite sequence of tests can prove that the ACID properties are fully supported. Passing the prescribed tests is a necessary, but not sufficient, condition for compliance with the ACID requirements. Nevertheless, for reporting purposes only the tests specified here are considered necessary, and only those tests need to be reported in the benchmark test report.
Note: The purpose of these tests is to demonstrate that ACID principles are supported by the System Under Test and are applied during Test Execution. They are not designed to be comprehensive quality assurance tests.

7.1.4 During Test Execution, the configuration necessary to ensure full compliance with the ACID properties must be applied. This applies to both the database (including TPC-E and User defined objects) and to the Database Sessions used to conduct ACID tests and Benchmark Execution.
Note 1: The concept "configuration" includes all database properties and characteristics that can be defined externally; it includes, but is not limited to, configuration and initialization files, environment settings, commands and stored SQL procedures, loadable modules and add-ons. For example, if the SUT is based on Undo/Redo Logs, then logging must be maintained for all Transactions, including those that do not contain a rollback option in the Transaction Profile.
Note 2: In cases where this benchmark testing is implemented on a distributed system, tests must be performed to verify that Transactions processed on two or more nodes comply with ACID properties.

7.1.5 Although the ACID tests do not test all types of Transactions in this scope of work, the ACID properties must be met for all Transactions.

7.1.6 Test Organizers submitting TPC Results must perform the ACID tests on any one of the systems for which results are provided, provided those systems use the same software implementations (for example, Operating System, DBMS, transaction programs). For example, this clause applies when Results are provided for multiple systems of the same product line. However, the Durability tests described in Clauses 7.5.2.2, 7.5.2.3 and 7.5.2.4 must be passed on all systems being reported. Every FDR must identify the systems used to verify the ACID requirements, together with a full description of the ACID tests performed and the results obtained.

7.2 Atomicity Requirements

7.2.1 Definition of the Atomicity Property

The system under test must ensure that Database Transactions are atomic; the system will either complete all individual operations on the data or ensure that none of the partially completed operations have any effect on the data in any way.

7.2.2 Atomicity Tests

Execute the market Trade-Order Transaction by setting the roll_it_back flag to 0. Verify that the appropriate rows have been added to the TRADE and TRADE_HISTORY tables.
Execute the market Trade-Order Transaction by setting the roll_it_back flag to 1. Check that no rows associated with the Trade-Order Transaction have been added to the TRADE and TRADE_HISTORY tables.
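A hedged sketch of the kind of queries these checks might use, assuming the recorded trade identifier is bound as :trade_id and that T_ID and TH_T_ID are the trade identifier columns of TRADE and TRADE_HISTORY:

select count(*) from TRADE where T_ID = :trade_id;
select count(*) from TRADE_HISTORY where TH_T_ID = :trade_id;
-- after the committed execution both counts are non-zero; after the rolled-back execution both are zero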

7.3 Consistency Requirements

7.3.1 Definition of the Consistency Property

Consistency is a property of the Application that ensures that any execution of a Database Transaction takes the database from one consistent state to another consistent state.

7.3.1.1 The TPC-E database, as initially populated by EGenLoader, must satisfy the consistency conditions.

7.3.1.2 If data is replicated as permitted in clause 2.3.3.3, each copy must satisfy the consistency conditions described in clause 7.3.2.

7.3.2 Consistency Conditions
The following clauses define the three consistency conditions. Each of the three requires an explicit demonstration that the condition is met.

7.3.2.1 Consistency condition 1

Records in the BROKER and TRADE tables must respect the relationship:

B_NUM_TRADES = count(*)

for each broker, defined by the following:

(B_ID = CA_B_ID) and (CA_ID = T_CA_ID) and (T_ST_ID = 'CMPT').

7.3.2.2 Consistency condition 2

Records in the BROKER and TRADE tables must respect the relationship:

B_COMM_TOTAL = sum(T_COMM)

for each broker defined by the following:

(B_ID = CA_B_ID) and (CA_ID = T_CA_ID) and (T_ST_ID = 'CMPT').
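The specification does not prescribe a particular query, but a hedged sketch of one possible check (assuming the CA_ columns belong to the CUSTOMER_ACCOUNT table) is:

select B_ID
from BROKER
left join CUSTOMER_ACCOUNT on B_ID = CA_B_ID
left join TRADE on CA_ID = T_CA_ID and T_ST_ID = 'CMPT'
group by B_ID, B_COMM_TOTAL
having B_COMM_TOTAL <> coalesce(sum(T_COMM), 0);
-- an empty result set means consistency condition 2 holds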

7.3.2.3 Consistency condition 3

Records in the HOLDING_SUMMARY and HOLDING tables must respect the relationship:

HS_QTY = sum(H_QTY)

for each group of assets defined by the following:

(HS_CA_ID = H_CA_ID) and (HS_S_SYMB = H_S_SYMB).
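Similarly, a hedged sketch of one possible check for condition 3:

select HS_CA_ID, HS_S_SYMB
from HOLDING_SUMMARY
left join HOLDING on HS_CA_ID = H_CA_ID and HS_S_SYMB = H_S_SYMB
group by HS_CA_ID, HS_S_SYMB, HS_QTY
having HS_QTY <> coalesce(sum(H_QTY), 0);
-- an empty result set means consistency condition 3 holds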

7.3.3 Consistency tests
The three consistency conditions must be verified both after the initial database population and after any Business Recovery test.

7.4 Isolation Requirements

7.4.1 Definition of the Isolation Property

7.4.1.1 Considering Transaction T1 and the simultaneously executing Transaction T2, the following phenomena (P0 to P3) occurring in T1 can be described.

  • P0 (Dirty Write) - Transaction T2 modifies (or inserts) data item R. Then, before T2 performs a COMMIT, Transaction T1 modifies (or deletes) data item R and then issues a COMMIT.
Note: T2 may perform additional database operations based on the state in which it left R's elements, creating a potential data consistency problem.
  • P1 (Dirty Read) - Transaction T2 modifies (or inserts) the data items R. Then, before T2 issues a COMMIT, Transaction T1 begins, reads the data items R and is able to retrieve the state of the data items as it was after T2 changed. Subsequently, T2 may perform a ROLLBACK operation.
Note: T1 may perform additional database operations based on the state of R's data items that has been rolled back and is assumed never to have existed, creating a potential data consistency problem.
  • P2 (“Non-repeatable read”) - Transaction T1 reads the data items R. Then, before Transaction T1 performs a COMMIT, Transaction T2 begins, which modifies (or deletes) the data items R and issues a COMMIT. T1 then repeats the reading of the data items R and can obtain the state of the data items after being modified by Transaction T2.
Note: Before detecting the changed (or deleted) state of R data items, T1 could perform additional database operations based on the state of R data items that are no longer considered valid, creating a potential data consistency issue.
  • P3 (“Phantom Read”) - Transaction T1 reads a set of data items satisfying some search condition. Then, before Transaction T1 performs a COMMIT, Transaction T2 begins and inserts or deletes one or more data items satisfying the search condition used by T1. T1 then repeats the initial read with the same search condition and may obtain a different set of data items than the original one.
Note: Before discovering a larger or smaller set of data items, T1 may perform additional database operations based on the set of data items that is no longer considered relevant, creating a potential data consistency issue.

7.4.1.2 Isolation is a property of a Transaction that indicates the degree to which it is isolated from the actions of other Transactions running concurrently with it. The table below, ordered from the lowest (L0) to the highest (L3) level of restriction, describes the four degrees of isolation in terms of which phenomena must not occur.

7.4.1.3 During Test Execution, each TPC-E Transaction shall provide a degree of isolation from Arbitrary Transactions no less than the degree specified in the following table:

7.4.1.4 During Test Execution, the SUT must allow simultaneous execution of Arbitrary Transactions.

7.4.1.5 During Test Execution, the data read by each TPC-E Transaction must be no older than the most recently Committed Data at the time the Transaction began.

7.4.1.6 Systems that implement Transaction isolation using locking or versioning techniques must demonstrate compliance with the isolation requirements by passing the tests described in Clause 7.4.2.

7.4.1.7 Systems that implement Transaction isolation using techniques other than locking or versioning may require other techniques to demonstrate that the isolation requirements are met. It is the Test Organizer's responsibility, in conjunction with the Auditor, to identify these techniques, implement them, perform them as a demonstration of compliance with the isolation requirements, and provide sufficient information in the FDR to support the assertion that the isolation requirements have been met.

7.4.2 Isolation Tests
The following isolation tests are designed to verify that the configuration and implementation of the System Under Test provides Transactions with the required degrees of isolation as described in clause 7.4.1.3.

7.4.2.1 P3 Read-Write Test

This test shall demonstrate that a Trade-Result Read-Write Transaction, while executing concurrently with another Trade-Result Read-Write Transaction, is protected from the P3 Phantom Read phenomenon. The second Trade-Result Transaction (Session S4 below) plays the role of an Arbitrary Transaction that adds a row to the HOLDING_SUMMARY table that has already been accessed by the first Trade-Result Transaction (Session S3 below).

For the purposes of this test, the two Trade-Result Transactions must be instrumented to record hs_qty upon returning from Frame 1. In addition, the Trade-Result Transaction executed by S3 must be able to repeat the execution of Frame 1 and to suspend its execution before Frame 2 begins.



Using the four Sessions S1 to S4, the following steps are performed in the appropriate order:

1. From S1, select an acct_id. Using an appropriate read-only transaction, find a symbol value for which there is no corresponding row in the HOLDING_SUMMARY table for the selected acct_id, and commit.

2. From S1, call and successfully complete the Trade-Order Transaction for the acct_id and symbol parameters selected in step 1. Record the trade_id associated with these trades.

3. From S2, request and successfully complete another Trade-Order Transaction for the acct_id and symbol parameters used in step 2. Record the trade_id associated with these trades.

4. From S3, invoke the Trade-Result Transaction with the trade_id recorded in step 2. Suspend the Transaction between Frame 1 and Frame 2. Record hs_qty and verify that it is set to zero.

5. From S4, invoke the Trade-Result Transaction with the trade_id recorded in step 3. Verify that the Transaction completes Frame 1 and begins executing Frame 2. Record hs_qty and verify that it is set to zero.

Case A: S4 stalls in Frame 2 and is then rolled back while S3 completes:
6A. From S3, repeat the execution of Frame 1 and pause again between Frame 1 and Frame 2. Record hs_qty and verify that it is set to zero.
7A. Resume execution of S3 at Frame 2. Verify that the remaining Frames complete successfully.
8A. Verify that S4 has been rolled back.

Case B: S4 completes (possibly after stalling) and S3 is rolled back:
6B. Verify that S4 completes the execution of Frame 2 and the remaining Frames.
7B. Verify that S3 has been rolled back.

Case C: S4 stalls in Frame 1 (invalid):
6C. If such a situation occurs, the test is considered invalid. In order to correctly test phantom read protection, it is necessary for the S4 session to reach the point in Frame 2 of the Trade-Result Transaction when a new row is added to the HOLDING_SUMMARY table. It is possible that the Trade-Result Transaction used for S4 may need to be modified to prevent it from being blocked in Frame 1. For example, it may be run at a lower level of Arbitrary Transaction isolation.

Note 1: The P3 test passes if either Case A or Case B occurs, and fails if Case C occurs. There may be other valid outcomes (for example, both S3 and S4 may fail), but if both S3 and S4 record hs_qty = 0 during the execution of Frame 1, then at most one of these Sessions can complete normally and commit its Transaction. The purpose of this test is to demonstrate that under any circumstances, if S3 re-reads the HOLDING_SUMMARY table after S4 has added (or attempted to add) a new row for the selected acct_id and symbol, a matching row will still not be found by S3.
Note 2: this isolation test creates one or more new holdings. A subsequent execution of the P2 Read-Write test (see clause 7.4.2.2) with the same acct_id and symbol parameters may close the position created during this test.

7.4.2.2 P2 Read-Write Test

This test shows that a Trade-Result Read-Write Transaction, while executing concurrently with another Trade-Result Read-Write Transaction, is protected from the P2 Non-Repeatable Read phenomenon. The second Trade-Result Transaction (Session S4 below) plays the role of an Arbitrary Transaction that updates the row in the HOLDING_SUMMARY table that was read by the first Trade-Result Transaction (Session S3 below).

For the purposes of this test, the two Trade-Result Transactions must be instrumented to record hs_qty upon returning from Frame 1. In addition, the Trade-Result Transaction executed by S3 must be able to repeat the execution of Frame 1 and to suspend its execution before Frame 2 begins.

Using the four Sessions S1 to S4, the following steps are performed in the appropriate order.
1. From S1, select an acct_id. Using an appropriate read-only transaction, find a symbol value for which there is a corresponding row in the HOLDING_SUMMARY table for the selected acct_id, record the HS_QTY for that holding, and commit.

2. From S1, request and successfully complete a Trade-Order Transaction with the acct_id and symbol parameters selected in step 1. Record the trade_id parameter associated with these trades.

3. From S2, request and successfully complete another Trade-Order Transaction with the acct_id and symbol parameters used in step 2. Record the trade_id parameter associated with these trades.

4. From S3, request a Trade-Result Transaction with the trade_id parameter obtained in step 2 and suspend execution between Frame 1 and Frame 2. Record hs_qty and check that it is equal to HS_QTY obtained in step 1.

5. From S4, request a Trade-Result Transaction with the trade_id parameter obtained in step 3. Check that it completes Frame 1 and begins execution of Frame 2. Write hs_qty and check that it is equal to HS_QTY obtained in step 1.

Case A: S4 stalls in Frame 2 and is then rolled back while S3 completes:
6A. From S3, re-invoke Frame 1 and pause again between Frames 1 and 2. Record hs_qty and verify that it equals the HS_QTY obtained in step 1.
7A. Resume execution of S3 by invoking Frame 2. Verify that the remaining Frames complete successfully.
8A. Verify that S4 has been rolled back.
Case B: S4 completes (possibly after stalling) and S3 is rolled back:
6B. Verify that S4 completes the execution of Frame 2 and the remaining Frames.
7B. Verify that S3 has been rolled back.
Case C: S4 stalls in Frame 1 (invalid):
6C. If such a situation occurs, the test is considered invalid. In order to correctly test security against the P2 Non-Repeatable Read event, it is necessary that the S4 session reaches the moment in Frame 2 of the Trade-Result Transaction when one of the rows is updated in the HOLDING_SUMMARY table. It is possible that the Trade-Result Transaction used for S4 may need to be modified to prevent it from being blocked in Frame 1. For example, it may be run at a lower level of Arbitrary Transaction isolation.

Note: This test passes if either Case A or Case B occurs, and fails if Case C occurs. There may be other valid outcomes (for example, both S3 and S4 may fail), but if S3 and S4 record the same hs_qty value during the execution of Frame 1, then at most one of these Sessions can complete normally and commit its Transaction. The purpose of this test is to demonstrate that under any conditions, when S3 repeats the read of the HOLDING_SUMMARY table for the given acct_id and symbol, the row and value found will be the same as in step 1.


7.4.2.3 P1 Read-Write Test

This test shows that a Trade-Result Read-Write Transaction, while another Trade-Result Read-Write Transaction is executing concurrently, is protected from the P1 Dirty Read phenomenon. For the purposes of this test, the Trade-Result Transaction must be instrumented to record se_amount upon return from Frame 5 and must be able to suspend execution in Frame 6 immediately before committing.

Using the three Sessions S1 to S3 in the appropriate order, the following steps are performed

1. From S1, invoke the Customer-Position Transaction for a selected cust_id, complete the Transaction, and record the resulting set of acct_id and cash_bal values.

2. From S1, invoke and successfully complete a Trade-Order Transaction with an acct_id selected from the set recorded in step 1, a chosen symbol, and type_is_margin set to 0. Record the trade_id assigned to this trade.

3. From S1, call and successfully complete another Trade-Order Transaction with the same acct_id parameter, but a different symbol parameter from the one used in step 2 and a type_is_margin parameter set to 0. Record the trade_id assigned to these trades.

4. From S2, invoke the Trade-Result Transaction with the trade_id input parameter obtained in step 2. Before invoking Frame 6, record se_amount, then invoke Frame 6 and pause before committing.

5. From S3, invoke the Trade-Result Transaction with the trade_id input parameter obtained in step 3. The Transaction may be suspended, may not complete successfully, or may be temporarily blocked from executing fully. If it reaches the start of Frame 6, record se_amount, then invoke Frame 6. If it reaches the end of Frame 6, pause before committing.

6. From S2, resume, commit, and successfully complete the Transaction. Record the resulting acct_bal value.

7. From S3, depending on what the transaction behavior was at the end of step 5:
If it reached the suspension point in Frame 6, allow it to continue and verify that it committed and completed successfully.

If it was blocked before finishing Frame 5, verify that it was unblocked, completed Frame 5, recorded se_amount, invoked Frame 6, committed, and completed successfully.

If it did not complete successfully and was rolled back, re-call the Trade-Result Transaction with the same trade_id input parameter. Verify that the Trade-Result Transaction executes completely, writes the se_amount value at the beginning of Frame 6, commits at the end of Frame 6, and completes successfully.

8. From S3, record the resulting acct_bal and check that it is equal to the cash_bal value from step 1 (using the acct_id selected in step 2) plus the sum of the output se_amount for these two Trade-Result Transactions.

7.4.2.4 P1 Read-Only Test

This test shows that a Customer-Position Read-Only Transaction, while a Trade-Result Read-Write Transaction is executing concurrently, is protected from the P1 Dirty Read phenomenon. For the purposes of this test, the Trade-Result Transaction must be able to suspend execution in Frame 6 immediately before committing.

Using the four Sessions S1 to S4, the following steps are performed in the appropriate order:

1. From S1, invoke the Customer-Position Transaction for a selected cust_id, complete the Transaction, and record the resulting set of acct_id and cash_bal values.

2. From S1, call and successfully complete a Trade-Order Transaction in which the corresponding input parameter acct_id matches one of the acct_id parameters recorded in step 1 and the value of type_is_margin is 0. Record the trade_id assigned to these trades.

3. From S2, invoke the Trade-Result Transaction with the trade_id input parameter obtained in step 2, and then pause execution in Frame 6 before committing.

4. From S3, call the Customer-Position Transaction with the cust_id input parameter selected in step 1. The transaction may succeed or fail, or may be temporarily blocked from fully executing.

5. From S2, resume and successfully complete the Trade-Result Transaction. Record the resulting acct_bal.

6. From S3, depending on the behavior of the Customer-Position Transaction at the end of step 4:

If it completed, record the resulting set of acct_id and cash_bal values and verify that the cash_bal for the acct_id used in step 2 is unchanged from step 1.

If it was blocked, check that it has now completed, record the resulting set of acct_id and cash_bal and check that the cash_bal value for the given acct_id used in step 2 matches the acct_bal from step 5.

If it has not been completed, proceed to the next step.

7. From S4, invoke the Customer-Position Transaction with the cust_id selected in step 1, complete the Transaction, record the resulting set of acct_id and cash_bal values, and verify that the cash_bal for the acct_id used in step 2 has changed from step 1 and reflects the trade settled in step 5 (by comparing it with the acct_bal from step 5).


7.5 Durability Requirements

The system under test must be configured to meet the durability requirements specified in this clause. The SUT demonstrates the Durability property by preserving Committed Transactions and maintaining database consistency after the failures listed in clause 7.5.2. Durability is tested by inducing Catastrophic and Non-Catastrophic failures of SUT components. Non-Catastrophic failures, described in clause 7.5.5, test the SUT's ability to maintain access to data. Catastrophic failures, described in clause 7.5.6, test the SUT's ability to preserve the effects of Committed Transactions. The duration of the effects of a Catastrophic failure is reported as the Business Recovery Time. No existing system provides absolute durability (that is, durability under any failure scenario). The specific set of single failures listed in Clause 7.5.2 is considered sufficiently representative to assert Durability in the face of such failures. However, the limited nature of these tests should not be interpreted as implying that there are no other unrecoverable cases of single failures.

7.5.1 Definition of concepts

Availability: The ability to perform database operations with full access to the data after a permanent irrecoverable failure of any single Persistent Media unit containing database tables, recovery log data, or Database metadata. See Clause 7.5.2.1.

7.5.1.1 Application Recovery: The process of restoring the application after a Catastrophic failure.

7.5.1.2 Application Recovery Time: The time elapsed from the start of Application Recovery to its end (see Clause 7.5.6.7).

7.5.1.3 Business Recovery: The process of restoring the business application after a Catastrophic failure and reaching a point where the business meets certain operational criteria.

7.5.1.4 Business Recovery Time: The time elapsed from the start of Business Recovery to its end (see Clause 7.5.6.8).

7.5.1.5 Catastrophic: A type of failure in which processing is interrupted without any warning to the SUT. After such a failure, the memory of the failed component is cleared and all active application context is lost.

7.5.1.6 Committed: A Transaction is Committed when its changes (inserts, deletes, updates) are permanent and visible to other Transactions.

7.5.1.7 Note: Transactions may be Committed without the Driver subsequently being notified of this fact, since message integrity is not necessary.

7.5.1.8 Database Recovery: The process of recovering a database after a Catastrophic system failure.

7.5.1.9 Database Recovery Time: The duration from the start of Database Recovery until the database files are completely restored.

Durability Assessment Requirements: the conditions that the SUT must satisfy during all Durability tests (see Clause 7.5.3).


7.5.1.10 Durable: A state that can withstand the failures described in Clause 7.5.2 and that has transactional update semantics.

7.5.1.11 Persistent Media: Non-volatile, persistent data storage, such as magnetic disk or tape.

7.5.1.12 Non-Catastrophic: A single failure that does not interrupt processing but may degrade performance and take the SUT out of steady state until it recovers from the failure.

7.5.2 Single points of failure
The following clauses describe the individual SUT components that are subjected to the Non-Catastrophic and Catastrophic failures described in clauses 7.5.5 and 7.5.6. Single points of failure apply to the SUT components required to meet the Durability requirements.

The Test Organizer may use a single test to perform durability testing of multiple single points of failure (for example, a single "total system failure test" may cover the three failure points described in Clauses 7.5.2.2, 7.5.2.3 and 7.5.2.4).

7.5.2.1 Permanent unrecoverable failure of one of the Persistent media.

7.5.2.2 Instantaneous interruption of processing (system crash / system hang) that requires a system reboot to recover.
Note: This also covers abnormal termination that requires loading a fresh copy of the Operating System from the boot device. It does not necessarily imply loss of data in non-volatile memory. An example of a mechanism for surviving an instantaneous interruption of processing is an Undo/Redo Log.

7.5.2.3 Failure of the entire memory (loss of contents).
Note: This assumes that the entire memory has failed. It may be caused by loss of external power or by a permanent failure of a memory board.

7.5.2.4 Loss of external power to the SUT for an indefinite period of time (power failure). This must cover at least all parts of the SUT that participate in processing Database Transactions.

Note: The power failure requirement can be met by providing enough Uninterruptible Power Supply (UPS) capacity to keep all components that lose external power running for at least 30 minutes. A UPS-protected configuration must not introduce new single points of failure that are not protected by other parts of the configuration. Sufficient capacity can be demonstrated either by measurement or by calculating the 30-minute power consumption (in watts) of the protected portion of the SUT, multiplied by 1.4.

7.5.3 Durability Assessment Requirements

All Durability tests must meet the following requirements:

  • be performed with the same number of Configured Users and the same Driver load as during a Measurement Interval
  • be run in a steady state
  • satisfy the Response Time restrictions set out in clause 6.5.1.7.
  • satisfy the requirements for a Combination of Transactions listed in clause 6.3.1.
  • run without errors at 95% or more of the Reported throughput
  • match all Driver and SUT configuration settings applied during the Measurement Interval.
7.5.4 Multiple instances of the Operating System

7.5.4.1 In configurations in which similar benchmarking functions are performed by more than one instance of the Operating System, the failure tests listed in clause 7.5.2 must be performed on at least one of those instances.

7.5.4.2 If multiple instances of the Operating System manage data that is processed as a single image from the point of view of the benchmarking applications (for example, a database cluster), then the Power Failure test must also be run simultaneously on all of these instances.

Note: The power failures of the multiple instances within the SUT must occur within 3 seconds of one another, allowing for the variance of any manual procedures used to accomplish the test.

7.5.4.3 If more than one instance of the Operating System manages data that is processed by the benchmark application as a single image and connected by physical media other than an embedded bus (for example, a bus expansion cable, a high-speed LAN, or another means of communication between multiple Operating System instances that may be subject to physical disruption), then an instantaneous interruption of that communication is included in the list of failures that must be tested under clause 7.5.2.2. If the connection is redundant, only one instance of the connection needs to be failed.

Note: this clause is not intended to require breaking the connection to redundant disk storage or disk subsystems.

7.5.5 Non-catastrophic Failures

A Non-Catastrophic failure is any failure that neither loses data stored in SUT memory nor requires reloading the Operating System from the boot device. The Non-Catastrophic failures described in this clause affect access to data stored on the Persistent Media. These requirements are also called Data Availability requirements.

7.5.5.1 The SUT shall maintain access to data on the Persistent Media during and after a failure identified in clause 7.5.2.1 (permanent non-recoverable failure of any one Persistent Media containing database tables, recovery log data, or Database Metadata). The Test Organizer must also restore the Persistent Media environment to its pre-failure state while maintaining the ability to access the data on the Persistent Media.

7.5.5.2 Persistent Media may be either volatile or non-volatile media configured appropriately to meet the requirements of clause 7.5.2.1. Non-volatile media are usually magnetic disks using replication (RAID-1 mirroring) or other protection methods (RAID-5, etc.) that guarantee access to data during a Persistent Media failure. Volatile media, such as memory, can be used if the volatile media can automatically transfer the data to non-volatile media after a power failure, before any data is lost and regardless of when power is restored.

Note 1: A configured and priced Uninterruptible Power Supply (UPS) is not considered an external power source.

Note 2: Memory can be considered Persistent Media if it can retain data long enough to meet the requirements described above, for example if it is backed by an Uninterruptible Power Supply and the contents of the memory can be transferred to non-volatile storage at the moment of failure. Note that no distinction is made between main memory and memory performing similar permanent or temporary data-storage functions in other parts of the system (for example, a disk controller cache). If main memory is used as Persistent Media, it must be considered a potential single point of failure. An example of a solution to a single Persistent Media failure is mirroring the Persistent Media.

If memory is used as Persistent Media and mirroring is used to provide durability, the mirrored memory banks must have independent power supplies.

7.5.5.3 Data Availability tests (also called Non-Catastrophic failure tests) shall comply with the Durability Assessment Requirements in clause 7.5.3.

7.5.5.4 Levels of Redundancy
The Redundancy Level denotes the degree to which data availability is guaranteed in the event of a single failure in the data storage components. The SUT must use one of the following Redundancy Levels:

  • Level 1 Redundancy (Persistent Media Redundancy): Guarantees access to data on the Persistent Media in the event of a failure on one of the Persistent Media.
Note: the purpose of this redundancy level is to test that the Persistent Media environment can withstand the failure of one Persistent Media unit and continue to service requests from the Operating System and/or DBMS.

Example: The Test Organizer implemented RAID-1 (mirroring) across the disks in the storage. If one of the disks fails, the Organizer must demonstrate access to the data on the remaining disks.

  • Level 2 Redundancy (Persistent Media Controller Redundancy): Enables Level 1 Redundancy and guarantees access to data on the Persistent Media when a single failure occurs in the storage controller used to achieve the redundancy level, or in the communications between the storage controller and the Persistent Media.
Note: The purpose of this redundancy level is to test that the implemented scheme can survive the failure of the storage controller that implements Level 1 Redundancy.

Example: If Level 1 Redundancy is achieved by implementing RAID-5 protection on disk, then Level 2 Redundancy will be tested by failing the hardware used to implement RAID-5 protection.

If the controller implementing RAID-5 is contained in a disk structure (or similar externally attached device), then the Organizer must demonstrate that it is still able to access the data stored within the structure.

If the controller implementing RAID-5 is located separately from the enclosure containing the disks, and the controller is not used as a persistent medium (for example, a mirrored write cache), then it is sufficient to break the connections between the controller and the disk structure.

  • Level 3 Redundancy (Full Redundancy): Includes Level 2 Redundancy and guarantees access to data on the Persistent Media when a single failure occurs in the Persistent Media system, including the links between the Level B system and the Persistent Media system.
Note 1: The Persistent Media system contains all the components necessary to meet the durability requirements described above. It does not include the Level B system or the system bus, but it does include the system bus adapter and all components located “below” the adapter.

Note 2: The purpose of this clause is to test the ability of the Level B system to withstand failures of its components and continue to process Transactions.

Note: The components tested under this clause are assumed to be field-replaceable modules. It is not the intent of this clause to require the Organizer to test the durability of the backplane of the Persistent Media enclosure or similar non-replaceable components. However, the tests do include verifying the fault-tolerance properties of the storage controllers, including any mirrored cache on a controller, and the corresponding software.

7.5.5.5 Durability Test Procedure for Data Availability

1. Determine the current number of completed trades in the database by running the query:
select count(*) as count1 from SETTLEMENT

2. Begin submitting Transactions for processing, increase throughput to the level required by the Durability Assessment Requirements (as described in clause 7.5.3), and maintain that level for at least 5 minutes.

Note: Once the Durability Assessment Requirements are met:

  • Any changes to the Driver configuration are prohibited until stage 5 is completed
  • Any changes to the SUT configuration are prohibited except those necessary to complete steps 3 and 4.
3. Induce the failure(s) necessary to verify the claimed Redundancy Level.

4. Begin the necessary recovery procedures.

5. Continue running the Driver for 20 minutes.

6. Cleanly shut down the Driver.

7. Get the new number of completed trades in the database by running:

select count(*) as count2 from SETTLEMENT

8. Compare the number of Trade-Result Transactions completed by the Driver with the value
(count2 - count1). Check that (count2 - count1) is equal to the number of entries of successful Trade-Result Transactions in the Driver log file.

9. Allow the recovery procedure to complete as expected.

7.5.6 Catastrophic failures.

Some failures can be catastrophic in nature, and in such cases, access to data is lost. The SUT must be able to save the state of the database and restore access to data in the shortest possible time.
Catastrophic failures are sudden and unpredictable, often resulting in unexpected losses in transaction processing. The requirements in this clause test the SUT's ability to preserve the effects of Committed Transactions. Because the ability to process transactions is key in an OLTP environment, the Test Organizer must measure and report the time it takes the DBMS to recover from Catastrophic failures. This recovery time is called the Business Recovery Time.
Note: Catastrophic failures are a major disruption to business processes, so it is imperative for businesses to recover from them as quickly as possible. There are many database configuration parameters and practices that directly affect DBMS performance and its recovery time after a Catastrophic failure. Although it is well known that load times can vary over a very large range between systems, load parameters have a vanishingly small effect on DBMS performance and are not part of the Business Recovery Time.


7.5.6.1 Catastrophic failure tests shall comply with the Durability Assessment Requirements given in Clause 7.5.3.

7.5.6.2 Restoring a recovery image from an archive copy of the database (for example, a copy made before the run) together with applying Rollback/Undo log data is not acceptable as the recovery mechanism for the failures listed in clauses 7.5.2.2, 7.5.2.3, and 7.5.2.4. Note that checkpoints, Validation Points, consistency points and similar snapshots of the database created while the run is in progress are not considered archive copies.

7.5.6.3 If the recovery mechanism relies on the contents of volatile memory before the failure, then the means used to avoid loss of the contents of volatile memory (for example, an Uninterruptible Power Supply) shall be taken into account when calculating the cost of the system (see clause 8.3.1.3).

7.5.6.4 The start of database recovery is the time when database files are first accessed by a process that has knowledge of the contents of those files and is intent on recovering the database or executing Transactions that operate on the database.

Note: access to the files by Operating System processes that check file system or volume integrity or repair data structures does not constitute the start of Database Recovery.

7.5.6.5 Database Recovery Finish is the point at which the database files have been restored.

Note: Typically the database will indicate this time in its log files.

7.5.6.6 Start of Application Recovery is the time when the first Transaction is submitted after the start of Database Recovery.

7.5.6.7 Application Recovery End is the point at which the SUT begins to operate at 95% or more of the Reported throughput and continues to do so for at least 20 minutes.

7.5.6.8 Test procedure for Catastrophic failures

1. Determine the current number of completed trades in the database by running:
select count(*) as count1 from SETTLEMENT.

2. Begin the execution of Transactions, increase throughput to the level required by the Durability Assessment Requirements (as described in clause 7.5.3), and maintain compliance with these requirements for at least 20 minutes.

3. Perform one or more Catastrophic Failures from paragraphs 7.5.2.2, 7.5.2.3 or 7.5.2.4.

4. If appropriate for the test configuration, stop submitting Transactions.

5. If necessary, restart the SUT (which may involve a hard reboot).

6. Mark the start time of Database Recovery (see paragraph 7.5.6.4), either automatically or manually by the operator.

7. Note the time at which the Database Recovery procedure ends (see clause 7.5.6.5); this may occur during the following steps.

8. Resume submitting Transactions (or restart them as in step 2) and mark this moment as the start of Application Recovery (see Clause 7.5.6.6). Ramp up to 95% of the Reported throughput.

Note: If there is a time gap between the end of Database Recovery and the start of Application Recovery, and the Drivers and Transactions need to be restarted (rather than simply continued), then a Trade-Cleanup Transaction can be run during this gap.

9. Mark the end of Application Recovery, as described in paragraph 7.5.6.7.

10. Shut down the Driver correctly.

11. Verify that the Driver reported no errors during steps 7 through 10. This ensures that an end user would not observe any negative effects (other than reduced application availability and potentially reduced performance) from the SUT failure and the subsequent Business Recovery.

12. Get the new number of completed trades in the database by running:
select count(*) as count2 from SETTLEMENT

13. Compare the number of Trade-Result Transactions completed by the Driver with the value (count2 - count1). Verify that (count2 - count1) is greater than or equal to the total number of successful Trade-Result Transaction entries in the Driver log files for the runs performed in step 2 and step 8.
If the counts are not equal, the SETTLEMENT table must contain the additional entries, and the difference must be less than or equal to the maximum number of Transactions that can simultaneously be in flight from the Driver to the SUT. This number depends on the Driver implementation and its configuration settings at the time of the failure.

Note: This difference can exist only because of Transactions that were committed on the System Under Test but whose output was not returned to the Driver before the failure.

14. Verify the consistency conditions as specified in Clause 7.3.3.

15. Calculate the Business Recovery Time as the sum of the Application Recovery Time and the Database Recovery Time, unless these time periods overlap. If Application Recovery begins before the end of Database Recovery, then Business Recovery Time is the time elapsed between the start of Database Recovery and the end of Application Recovery.

7.5.7 Required Durability Reporting

7.5.7.1 The Test Organizer shall describe the Level of Redundancy and describe the test(s) used to demonstrate compliance in the Report.

7.5.7.2 Data Availability Chart
The Report must present a graph of Measured Performance versus elapsed time for fragments of Data Availability Test execution, prepared in accordance with the following conventions:

  • The X-Axis represents the elapsed time for the test runs described in Clause 7.5.5.5, for stages 2 to 6
  • The Y Axis represents performance expressed in tpsE
Note: The purpose is to demonstrate the impact of the recovery procedure on performance.

7.5.7.3 Reported values
The Business Recovery Time must be reported in the Executive Summary and in the Report. If the failures described in Clauses 7.5.2.2, 7.5.2.3 and 7.5.2.4 were not combined into a single Durability test (usually by cutting off power to the Database Server during the run), then the Business Recovery Time for the instantaneous-interruption failure is the one that must appear in the Executive Summary. The Business Recovery Time of every test requiring Business Recovery must be reported in the Report.

7.5.7.4 Business Recovery Time Graph
The Report shall include a graph of Measured Performance versus elapsed time for segments of Business Recovery test execution, prepared in accordance with the following arrangements:
  • The X-axis displays the maximum elapsed time for the two test runs described in paragraph 7.5.6.8 for stages 2 and 8
  • The Y-axis represents performance expressed in tpsE
  • A graph scale dimension of 1 minute should be used.
  • Y-axis data should be plotted for both runs on the same graph, with the end points of each run clearly indicated.
  • For the purpose of creating a graph, the zero reference point is defined as follows:
  • For the run described in step 2 in clause 7.5.6.8, the zero reference point is defined as the point in time at which the first Transaction is applied to the database
  • For the run described in step 8 in clause 7.5.6.8, the zero reference point is defined as the point in time at which Database Recovery begins.
  • For the purpose of creating the graph, the end of a run is determined as follows:
  • For the run described in step 2 in clause 7.5.6.8, the end of the run is the moment at which the failure occurs (see step 3 of clause 7.5.6.8)
  • For the run described in step 8 in clause 7.5.6.8, the end of the run is the time at which Application Recovery has completed successfully (see step 8 in clause 7.5.6.8)
  • For the run described in step 8 in clause 7.5.6.8, if there is any period of time between the end of the Database Recovery and the start of the Application Recovery, this period should be ignored and the two periods should be plotted adjacent to each other.










It is no secret that, given the heuristic rule known as the CAP theorem, the class of NoSQL solutions, unlike traditional RDBMSs, cannot provide full ACID support. For many tasks this is not necessary, and supporting one of the properties forces a compromise on the others, which is why such a wide variety of solutions exists. In this article I would like to look at various architectural approaches to partially meeting the requirements for a transactional system.

"A" Atomicity

Atomicity ensures that no transaction is partially committed to the system. Either all of its sub-operations will be performed, or none will be performed.

NoSQL systems are usually chosen for high performance rather than for transactional semantics, since supporting them adds processing overhead. Many systems still provide guarantees at the level of a single key or row (Google BigTable) or offer an API of atomic operations (Amazon DynamoDB) in which only one thread can modify a record, useful, for example, for a user hit counter distributed across a cluster. Most systems rely on non-blocking read-modify-write cycles. The cycle consists of three stages: read the value, modify it, write it back. In a multi-threaded environment many things can go wrong; for example, what if someone changes the record between the read and the write phases? The main mechanism for resolving such conflicts is the Compare-and-Swap algorithm: if someone changed the record during the cycle, we detect that the record has changed and repeat the cycle until our value is stored. This approach is preferable to a mechanism that fully locks writes. The number of such retries can be very large, so the operation needs a timeout after which it is rejected.
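The same compare-and-swap idea can be expressed in SQL as an optimistic-concurrency sketch over a hypothetical counters table (the names are illustrative only); the update succeeds only if the value is still the one that was read:

-- read phase: remember the current value
SELECT cnt FROM counters WHERE name = 'page_hits';                -- suppose it returns 41

-- write phase: compare-and-swap in a single statement
UPDATE counters SET cnt = 42 WHERE name = 'page_hits' AND cnt = 41;
-- if 0 rows were affected, someone changed the value in between:
-- re-read and retry, giving up after a timeout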

"C" Consistency

A transaction that reaches its normal completion and commits its results preserves database consistency. Given that NoSQL distributes information across servers, this comes down to whether all replicas holding a copy of the data always contain the same version of it.

By its nature, modern NoSQL has to choose high availability and the ability to scale the cluster horizontally, which means the system cannot guarantee complete data consistency and has to relax the definition of consistency. There are two approaches:

Strict consistency
Such systems guarantee that the replicas can always agree on a single version of the data to return to the user. Some replicas will not yet contain that value, but when the system processes a request for a value by key, the machines can always decide which value to return; it just will not always be the latest one. How it works: suppose we have N replicas of the same key. When a request to update the key's value arrives, the system does not return a result to the user until W replicas have acknowledged receiving the update. When the user requests the value, the system returns a response once at least R replicas have returned the same value. The system is considered strictly consistent if R + W > N. The choice of R and W determines how many machines must respond before an answer is returned to the user; usually R + W = N + 1 is chosen, the minimum condition that guarantees strict consistency. For example, with N = 3 replicas, W = 2 and R = 2 give R + W = 4 > 3, so any read quorum overlaps any write quorum in at least one replica.
Eventual consistency
Some systems (Voldemort, Cassandra, Riak) allow you to choose R and W such that R + W ≤ N. When a user requests information, there may be cases where the system cannot resolve a conflict between versions of a key's value. To resolve conflicts, a form of versioning called vector clocks is used. This is a vector associated with each key that contains a change counter for each replica. Let servers A, B and C be replicas of the same key; the vector then holds three values (N_A, N_B, N_C), initially set to (0, 0, 0). Each time a replica changes the value of the key, it increments its own counter in the vector. If B changes the value of a key whose version was (39, 1, 5), the vector becomes (39, 2, 5). When another replica, say C, receives the update from replica B, it compares the vector with its own. As long as all of its own counters are less than the ones that came from B, the received value is the newer stable version and it can overwrite its own copy. If B and C hold vectors in which some counters are greater and some are smaller, for example (39, 2, 5) and (39, 1, 6), the system detects a conflict.

The resolution of this conflict varies on different systems; Voldemort returns multiple copies of the value, leaving the conflict to be resolved by the user. Two versions of a user's shopping cart on a site can be merged without loss of information, whereas merging two versions of the same editable document requires user intervention. Cassandra, which stores the timestamp of each record, returns the latest one if a conflict is detected; this approach does not allow merging two versions without losing information, but it simplifies the client part.

Cassandra, since version 1.1, guarantees that if you run the update:

UPDATE Users
SET login = 'login', password = 'password'
WHERE key = 'key'

then no concurrent read will see a partial update of the data (login changed but password not), and this holds only for rows within the same column family that share the same key. This roughly corresponds to the read uncommitted isolation level, at which the lost update conflict is resolved. However, Cassandra does not provide a cluster-level rollback mechanism: for example, login and password may be saved on some nodes but on fewer than the W nodes needed to give the user a correct result, and the user then has to resolve the conflict himself. Isolation is implemented by creating, for each modified record, a version invisible to clients, which then automatically replaces the old version using the Compare-and-Swap mechanism described above.

"D" Reliability

Regardless of problems at lower levels (for example, a power outage or hardware failure), the changes made by a successfully completed transaction must remain in place after the system is brought back into operation. In other words, if the user received confirmation from the system that the transaction completed, he can be sure that the changes he made will not be undone by some failure.

The most predictable failure scenario is a power outage or a server restart. A fully durable system in this case must not return a response to the user until it has written all changes from memory to disk. Writing to disk is slow, and many NoSQL systems trade durability for performance.

Ensuring durability within a single server
A standard hard disk can handle 70-150 operations per second with a throughput of up to 150 MB/s; an SSD, about 700 MB/s; DDR memory, 6,000-17,000 MB/s. Durability within a single server while keeping performance high therefore comes down to reducing random-access writes and increasing sequential writes. Ideally, the system should minimize the number of writes between fsync calls (which synchronize data in memory with the disk). Several techniques are used for this.
Controlling fsync frequency
Redis offers several ways to configure when fsync is called. You can have it called after every write, which is the slowest but safest choice. To improve performance, you can flush to disk every N seconds; in the worst case you lose the last N seconds of data, which may be perfectly acceptable for some users. If durability is not critical at all, you can disable fsync and rely on the operating system to synchronize memory with the disk at some point.
Increasing sequential writes with logging
To search data efficiently, NoSQL systems often use additional structures, for example B-trees for indexes; maintaining them causes many random disk accesses. To reduce this, some systems (Cassandra, HBase, Riak) append update operations to a sequentially written file called a redo log. While the main structures are written to disk fairly rarely, the log is written often. After a crash, missing records can be restored from the log.
Increasing throughput by grouping writes
Cassandra groups several concurrent changes within a short window so that they can be combined into a single fsync. This approach, called group commit, increases the response time for an individual user, who has to wait for several other transactions in order to commit his own, but the overall throughput increases because multiple writes are merged.
Ensuring durability within a server cluster
Due to the possibility of unexpected disk and server failures, it is necessary to distribute information across several machines.
Redis uses a classic master-slave architecture for data replication: all operations applied to the master are shipped to the replicas in the form of a log.
In MongoDB, each document is stored on a configurable number of servers, and you can set the value W described above, the minimum number of servers that must record the write before control is returned to the user.
HBase achieves multi-server reliability through the use of a distributed file system HDFS.

In general, one can notice a certain tendency in modern NoSQL tools towards providing greater data consistency. But still, SQL and NoSQL tools can exist and develop in parallel and solve completely different problems.

You may already have heard the term transaction. If not, or if you have forgotten, recall that a transaction is a sequence of operations on a database (inserts, updates or deletes) that must be performed as a single whole.

We all use banking services and carry out various operations with funds. How is reliability achieved in this case? The requirements for the safety of operations in such systems are higher than usual. Now imagine that you are transferring money to someone you know. You want the funds to be debited from your account and credited to the recipient's account. But what happens in the event of a failure, such as a power outage or another emergency? Who guarantees that you will not end up in a situation where the money has left your account but never arrived in the recipient's account? Transactions help prevent such situations.

Committing and rolling back a transaction.

A transaction combines several SQL statements. If executing any one of them fails, the transaction is immediately rolled back (rollback). If everything is fine and all actions within the transaction complete successfully, the transaction is committed (commit), and only then are the corresponding changes written to the database.

Now let's look at an example of using transactions in the MySQL DBMS (in most popular DBMSs the mechanism is essentially the same):

Suppose we have a test table accounts that stores information about user accounts, and the field we are interested in is called balance. When transferring from one account to another, funds must be debited from one account and credited to the other (for practice, you can create a small database yourself, for example as sketched below).
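A minimal sketch of such a table, assuming MySQL and the InnoDB engine; the names are purely illustrative:

CREATE TABLE accounts (
    id      INT PRIMARY KEY,
    balance DECIMAL(10, 2) NOT NULL
) ENGINE=InnoDB;  -- InnoDB is required for transaction support

INSERT INTO accounts (id, balance) VALUES (1, 1000.00), (2, 500.00);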

The transaction starts with the keyword BEGIN. The two UPDATE statements inside it are part of the same transaction and must either both take effect or neither of them (a rollback must occur). The COMMIT command marks the end of the transaction and commits the changes. If the commit succeeds, all the changes appear in the database. A sketch of such a transfer is shown below.
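A hedged sketch of the transfer described above, using the hypothetical accounts table:

BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;  -- debit the sender
UPDATE accounts SET balance = balance + 100 WHERE id = 2;  -- credit the recipient
COMMIT;  -- both changes become visible together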

To roll back a transaction explicitly, you can issue the ROLLBACK command, as in the sketch below.
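A sketch of abandoning the same transfer instead of committing it:

BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
ROLLBACK;  -- the debit is undone and the balance of account 1 is unchanged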

ACID properties.

So, we have covered the concepts of transaction, commit and rollback. Now let's look at four very important properties of a transaction that interviewers love to ask about. The ACID requirements were formulated in the late 1970s by Jim Gray (a Turing Award winner for his contribution to the development of databases). Here is what they say:

  • Atomicity. What we talked about above: a transaction is atomic, that is, all actions performed within it form a single whole.
  • Consistency. Each new transaction moves the database from one consistent state to another. If your database is distributed, all of its replicas must contain the same version of the data for it to be consistent. This rule is often violated by NoSQL databases.
  • Isolation. Transactions do not affect each other; with parallel execution, partially updated data must not be observed.
  • Durability. You want all the changes made by a committed transaction to survive failures; that is why this property is on the list.