Checkpointed System
Classification Key
Core Security, Tampering
Problem
A component failure can cause the loss or corruption of state information maintained by the failed component. Systems that rely on retained state for correct operation must be able to recover from such loss or corruption.
How can we design a system so that its state can be recovered and restored to a known valid state in case a component fails?
Solution
Define an explicit set of states and have the system follow defined state sequences throughout its life cycle. Persist state information continuously, so that a recent valid copy is always available. The pattern can be realized in a variety of configurations, each of which provides the ability to restart the system from a known valid state (i.e., the checkpoint), either on the same platform or on a different one.
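The following is a minimal sketch of one possible configuration, not a definitive implementation. It assumes a single component whose state fits in a small JSON document stored in a local file. The class name CheckpointedComponent, its state names, and the file layout are hypothetical and chosen only for illustration. The checkpoint is written to a temporary file and then renamed into place, so a crash during the write leaves the previous checkpoint intact, and a SHA-256 digest is stored alongside the payload so that a corrupted checkpoint is rejected on restore.

```python
import hashlib
import json
import os
import tempfile


class CheckpointedComponent:
    """Hypothetical component that persists its state as a checkpoint file."""

    # Assumed life-cycle states; a real system would define its own.
    STATES = ("idle", "running", "done")

    def __init__(self, path):
        self.path = path
        self.state = "idle"
        self.data = {"processed": 0}

    def checkpoint(self):
        """Write the current state to disk, replacing the old checkpoint atomically."""
        payload = json.dumps({"state": self.state, "data": self.data}, sort_keys=True)
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        directory = os.path.dirname(os.path.abspath(self.path))
        # Write to a temporary file in the same directory, then rename it
        # over the old checkpoint so readers never see a partial file.
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump({"payload": payload, "sha256": digest}, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.path)

    def restore(self):
        """Load the last checkpoint; return False and keep current state if it is unusable."""
        try:
            with open(self.path, encoding="utf-8") as f:
                record = json.load(f)
            payload = record["payload"]
            actual = hashlib.sha256(payload.encode("utf-8")).hexdigest()
            if actual != record["sha256"]:
                return False  # checkpoint is corrupted
            snapshot = json.loads(payload)
            state, data = snapshot["state"], snapshot["data"]
        except (OSError, ValueError, KeyError, TypeError, AttributeError):
            return False  # missing, unreadable, or malformed checkpoint
        if state not in self.STATES:
            return False  # not a known valid state
        self.state, self.data = state, data
        return True
```

A component would call checkpoint() after each state transition and restore() on startup, falling back to its initial state when restore() returns False. Because the checkpoint is an ordinary JSON file, it can be read back on a different platform as well. Note that a plain digest only detects accidental corruption; guarding against deliberate tampering would require a keyed integrity check (for example, an HMAC with a secret key), which this sketch leaves out.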
Known Uses
Periodic auto-save feature in Microsoft Word.
Source
Open Group Catalog
Tags
State Machine, Graceful Restart