Database Recovery

What is recovery?

  • Recovery is the process of restoring a database to the correct state in the event of a failure.
  • It ensures that the database is reliable and remains in consistent state in case of a failure.

  • Database recovery can be classified into two parts;

    1. Rolling Forward applies redo records to the corresponding data blocks.
    2. Rolling Back applies rollback segments to the datafiles. It is stored in transaction tables.

  • We can recover the database using Log–Based Recovery.

Log-Based Recovery

  • Logs are the sequence of records, that maintain the records of actions performed by a transaction.
  • In Log – Based Recovery, log of each transaction is maintained in some stable storage. If any failure occurs, it can be recovered from there to recover the database.
  • The log contains the information about the transaction being executed, values that have been modified and transaction state.
  • All these information will be stored in the order of execution.
Example:
Assume, a transaction to modify the address of an employee. The following logs are written for this transaction,

Log 1: Transaction is initiated, writes 'START' log.
Log: <Tn START>

Log 2: Transaction modifies the address from 'Pune' to 'Mumbai'.
Log: <Tn Address, 'Pune', 'Mumbai'>

Log 3: Transaction is completed. The log indicates the end of the transaction.
Log: <Tn COMMIT>

There are two methods of creating the log files and updating the database,
1. Deferred Database Modification
2. Immediate Database Modification

1. In Deferred Database Modification, all the logs for the transaction are created and stored into stable storage system. In the above example, three log records are created and stored it in some storage system, the database will be updated with those steps.

2. In Immediate Database Modification, after creating each log record, the database is modified for each step of log entry immediately. In the above example, the database is modified at each step of log entry that means after first log entry, transaction will hit the database to fetch the record, then the second log will be entered followed by updating the employee's address, then the third log followed by committing the database changes.

Recovery with Concurrent Transaction

  • When two transactions are executed in parallel, the logs are interleaved. It would become difficult for the recovery system to return all logs to a previous point and then start recovering.
  • To overcome this situation 'Checkpoint' is used.
Checkpoint
  • Checkpoint acts like a benchmark.
  • Checkpoints are also called as Syncpoints or Savepoints.
  • It is a mechanism where all the previous logs are removed from the system and stored permanently in a storage system.
  • It declares a point before which the database management system was in consistent state and all the transactions were committed.
  • It is a point of synchronization between the database and the transaction log file.
  • It involves operations like writing log records in main memory to secondary storage, writing the modified blocks in the database buffers to secondary storage and writing a checkpoint record to the log file.
  • The checkpoint record contains the identifiers of all transactions that are active at the time of the checkpoint.
Recovery
  • When concurrent transactions crash and recover, the checkpoint is added to the transaction and recovery system recovers the database from failure in following manner,

  • 1. Recovery system reads the log files from end to start checkpoint. It can reverse the transaction.
    2. It maintains undo log and redo log.
    3. It puts the transaction in the redo log if the recovery system sees a log <Tn, Commit>.
    4. It puts the transaction in undo log if the recovery system sees a log with <Tn,  Start>.

  • All the transactions in the undo log are undone and their logs are removed.
  • All the transactions in the redo log and their previous logs are removed and then redone before saving their logs.