Disaster recovery and business continuity auditing
Disaster recovery and business continuity refers to an organization’s ability to recover from a disaster or other unexpected event and to resume or continue operations. Organizations should have a plan in place (usually referred to as a "Disaster Recovery Plan" or "Business Continuity Plan") that outlines how this will be accomplished. The key to successful disaster recovery is to have such a plan in place well before disaster strikes.
Two of the key metrics in a disaster recovery environment are the Recovery Time Objective (RTO) and the Recovery Point Objective (RPO). RTO is the target time within which a system must be completely up and running again after a disaster. RPO is the point in time to which data must be restorable from the backup copy, and so expresses how much recent data the organization can afford to lose.
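As a rough illustration of how these two metrics might be checked against the results of a drill, the sketch below compares a measured data-loss window and downtime with the stated objectives. The function names, the timestamps, and the four-hour objectives are all hypothetical and are not taken from any standard or tool.

```python
from datetime import datetime, timedelta


def recovery_metrics(last_backup, disaster, restored):
    """Return (data_loss_window, downtime) for one incident or drill."""
    data_loss_window = disaster - last_backup  # work done since the last usable backup
    downtime = restored - disaster             # time until the system was running again
    return data_loss_window, downtime


def meets_objectives(last_backup, disaster, restored, rpo, rto):
    """Compare the measured values with the RPO and RTO targets."""
    data_loss_window, downtime = recovery_metrics(last_backup, disaster, restored)
    return {"rpo_met": data_loss_window <= rpo, "rto_met": downtime <= rto}


# Hypothetical drill: backup at 02:00, simulated disaster at 09:30, service restored at 13:00.
last_backup = datetime(2024, 1, 10, 2, 0)
disaster = datetime(2024, 1, 10, 9, 30)
restored = datetime(2024, 1, 10, 13, 0)

result = meets_objectives(
    last_backup, disaster, restored,
    rpo=timedelta(hours=4),  # assumed objective
    rto=timedelta(hours=4),  # assumed objective
)
print(result)
```

In this made-up drill the system is back within the assumed RTO, but the last backup is older than the assumed RPO allows, which would suggest that backups need to run more often.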
When conducting an audit of a disaster recovery plan, several factors should be considered. These are described below.
Written disaster recovery plan with continual updating
To be effective, the plan must be written, must be understandable, and must be accessible to those who need it when they need it. Because of the constant changes that occur in the modern business environment, a plan should be updated frequently to deal with new and existing threats as they develop. The auditor needs to determine whether the procedures stated in the plan to achieve these ends are actually used in practice.
This can be accomplished through:
- Direct observation of procedures
- Examination of the disaster recovery plan
- Inquiries of personnel
In other words, the plan needs to be a living program that is audited and updated regularly as changes are identified that could affect the personnel or the areas that a disaster would strike.
Designated hot site or cold site
A hot or cold site is a location that an organization can move to after a disaster if the current facility is unusable. The difference between the two is that a hot site is fully equipped to resume operations while a cold site is not. There is also the warm site, which can resume some, but not all, operations. The type of site a company establishes depends on a cost-benefit analysis and the needs of the individual organization. The plan should also spell out how relocation to the new facility is to be conducted. A company should periodically run tests and trials to verify the viability and effectiveness of the plan, to identify any deficiencies, and to determine how they can be dealt with. An audit of a company's Disaster Recovery Plan should primarily assess the probability that the organization's operations can be sustained at the level assumed in the plan, as well as the entity's ability to actually establish operations at the site.
The auditor should:
- Examine and test the procedures involved
- Conduct outside research relating to disaster recovery
- Determine reasonable standards relating to implementation
- Tour, examine, and research the outside facility.
Ability to recover data and systems
The continual backing up of data and systems can help minimize the impact of threats. Even so, the plan should also include information on how best to recover any data that has not been copied. Controls and protections should be in place to ensure that data is not damaged, altered, or destroyed during this process. Information technology experts and procedures that can accomplish this need to be identified. Vendor manuals can also assist in determining how best to proceed.
Processes for frequent backup of systems and data
The auditor should determine if these processes are effective and are actually being implemented by personnel. This can be accomplished through:
- Direct observation of the processes
- Analyzing and researching the equipment used
- Conducting computer-assisted audit techniques and tests
- Examination of paper and paperless records
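One simple computer-assisted test of backup frequency is to scan a log of backup timestamps for intervals longer than the schedule allows. The following is a minimal sketch under assumed inputs: the log entries, the 25-hour tolerance, and the function name `find_backup_gaps` are hypothetical.

```python
from datetime import datetime, timedelta


def find_backup_gaps(timestamps, max_interval):
    """Return (earlier, later) pairs of consecutive backups spaced further apart than max_interval."""
    ordered = sorted(timestamps)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier > max_interval
    ]


# Hypothetical nightly backup log in which the 3 March run is missing.
backup_log = [
    datetime(2024, 3, 1, 1, 0),
    datetime(2024, 3, 2, 1, 0),
    datetime(2024, 3, 4, 1, 0),
    datetime(2024, 3, 5, 1, 0),
]

# Assumed tolerance: a nightly schedule plus one hour of slack.
gaps = find_backup_gaps(backup_log, max_interval=timedelta(hours=25))
for earlier, later in gaps:
    print(f"No backup recorded between {earlier} and {later}")
```

Any gap reported this way is only a lead for further inquiry; the auditor would still need to confirm with personnel and records whether a backup was actually missed.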
Tests and drills of disaster procedures
Practice drills should be conducted periodically to determine how effective the plan is and what changes may be necessary. The auditor’s primary concern here is verifying that these drills are being conducted properly, that problems uncovered during them are addressed, and that procedures designed to deal with those deficiencies are implemented and tested for effectiveness.
Data and system backups stored offsite
The auditor can verify this through paper and paperless documentation and actual physical observation. The backups and procedures should be tested to confirm data integrity and effective processes. The security of the storage site also needs to be confirmed.
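One common way to test the integrity of a restored backup is to compare a cryptographic checksum of the restored file with that of the original. The sketch below uses SHA-256 from Python's standard library; the file names and contents are invented for the demonstration, and a real test would run against the organization's actual backup and restore procedures.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restored_copy_matches(original, restored):
    """Return True when the restored file is byte-for-byte identical to the original."""
    return sha256_of(original) == sha256_of(restored)


# Demonstration with throwaway files (hypothetical names and contents).
with tempfile.TemporaryDirectory() as workdir:
    original = Path(workdir) / "ledger.csv"
    good_copy = Path(workdir) / "ledger_restored.csv"
    damaged_copy = Path(workdir) / "ledger_damaged.csv"
    original.write_bytes(b"id,amount\n1,100\n")
    good_copy.write_bytes(b"id,amount\n1,100\n")
    damaged_copy.write_bytes(b"id,amount\n1,900\n")

    good_result = restored_copy_matches(original, good_copy)
    damaged_result = restored_copy_matches(original, damaged_copy)

print(good_result, damaged_result)
```

A matching checksum shows only that the restored file equals the file that was backed up; it says nothing about whether the original itself was correct or complete.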
Appointed disaster recovery committee and chairperson
The entity needs to appoint individuals responsible for designing the plan and implementing it when needed. Generally, this consists of a team headed by a project manager, with a deputy manager who can take over the responsibilities if needed. The qualities needed for this position vary depending upon the organization.
The qualities of the project manager generally include:
- Good leadership abilities
- Strong knowledge of company business
- Strong knowledge of management processes
- Experience and knowledge in information technology and security
- Good project management skills
Other members of the team need to have a clear understanding of the requisite procedures and the ability to perform them. An auditor needs to examine and assess the training, experience, and abilities of the project manager and deputy project manager, analyze the capabilities of the team members to complete assigned tasks, and confirm that more than one individual is trained and capable of performing each function. Tests and inquiries of personnel can help achieve this objective.
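The requirement that more than one individual be able to perform each function lends itself to a simple cross-check of training records. The sketch below is illustrative only: the function names, staff names, and the `single_points_of_failure` helper are hypothetical.

```python
def single_points_of_failure(trained_staff, minimum=2):
    """Return the functions that have fewer than `minimum` distinct trained individuals."""
    return sorted(
        function
        for function, people in trained_staff.items()
        if len(set(people)) < minimum
    )


# Hypothetical training records: recovery function -> people trained to perform it.
trained_staff = {
    "restore database": ["A. Lee", "B. Ortiz"],
    "activate hot site": ["C. Patel"],
    "notify vendors": [],
}

under_covered = single_points_of_failure(trained_staff)
print(under_covered)
```

Functions flagged in this way are candidates for additional training, and the auditor would still test whether the listed individuals can actually perform the task.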
Visibly listed emergency telephone numbers
The auditor can verify through direct observation that emergency telephone numbers are listed and easily accessible in the event of a disaster.
Insurance
The auditor should determine the adequacy of the company's insurance coverage (particularly property and casualty insurance) through a review of the company's insurance policies and other research. Among the items that the auditor needs to verify are the scope of the policy (including any stated exclusions), that the amount of coverage is sufficient to cover the organization’s needs, and that the policy is current and in force. The auditor should also ascertain, through a review of the ratings assigned by independent rating agencies, that the insurance company or companies providing the coverage have the financial viability to cover the losses in the event of a disaster.
Procedures allowing effective communication
Management and the recovery team should have disaster recovery procedures that allow for effective communication. This can be accomplished by making sure contact information is easily accessible and by conducting drills that test communication abilities. Procedures should include non-technological as well as technological methods in case of power or system failures. Communications between the organization and outside individuals and organizations also need to be taken into account when designing the plan. Procedures to test this communication ability generally mirror those used within the organization itself. The auditor should evaluate these procedures and assumptions to determine if they are reasonable and likely to be effective.
The auditor's evaluation can be accomplished through:
- Testing of procedures
- An inquiry of all employees
- Comparisons to other company plans and industry standards
- Examination of company manuals and other written procedures
Updated system and operation documentation confirmation
Adequate records need to be retained by the organization. The auditor should physically examine records, billings, and contracts to verify this. Outside research, such as contacting vendors, may also be conducted to determine the reasonableness of management’s assertions.
Emergency procedures
Procedures for stocking food and water, administering CPR and first aid, and dealing with family emergencies should be clearly written and tested. The company can generally accomplish this through good training programs and a clear definition of job responsibilities.
The auditor can verify this is accomplished through:
- Inquiries of personnel
- Physical observation
- Examination of training records and any certifications
Backup of key personnel positions
Clearly written policies and specific communication with employees should be used to substantiate this. There must also be confirmation that the personnel backups can actually perform the duties assigned to them in the event of an emergency. Periodic training can also help ensure this. This training should include updates to existing job positions and testing to confirm proficiency.
The auditor needs to verify that:
- Policies are being enforced
- Testing is effective
- Training is adequate.
Hardware and software vendor list
Copies of this list should be periodically updated and stored both on and off site, and should be accessible to those who require them. An auditor should test the procedures used to meet this objective and determine their effectiveness.
Mission statement
This should clearly identify the purpose and goals of the Disaster Recovery Plan. The mission statement can also help the auditor obtain a better understanding of the organization’s environment. An auditor should examine it to determine the objectives, priorities, and goals of the plan.
Both manual and automated procedures in place
Procedures in place to accomplish the needed objectives should take into account the possibility of power failures or other situations in which technology cannot be used. The plan should indicate what procedures are to be used in this situation and should also include information on the storage of flashlights and candles, as well as additional safety procedures in case of gas leaks, fires, or other hazards. Trial runs should be conducted to test the procedures' effectiveness and viability.
The auditor should:
- Examine and test procedures for reasonableness
- Make inquiries of personnel
- Conduct outside research
Contractual agreements with external agencies/companies
The plan needs to take into account the extent of the organization's responsibilities to other entities and its ability to meet those commitments in the event of a major disruption. Are there clauses in contracts that limit legal liability for lack of performance in the event of disaster or any other unusual circumstance? Agreements pertaining to establishing support and assisting with recovery for the entity should also be outlined.
The auditor should:
- Examine the reasonableness of the plan
- Determine whether it takes all factors into account
- Verify the contracts and agreements through documentation and outside research
Summary
In conducting the audit, the individual or team should make use of various other procedures and processes to achieve the objectives of the audit. These objectives should be clearly stated in the audit plan. Certification to the British Standard on Business Continuity, BS 25999, is available from BSI.
See also
- Disaster recovery
- Information technology audit
- Information technology audit - operations
- Business continuity planning
External links
- The American Institute of Certified Public Accountants (AICPA)
- Information Systems Audit and Control Association (ISACA)
- Association of Information Technology Professionals (AITP)
- Institute of Internal Auditors (IIA)
- International Association for Computer Information Systems (IACIS)
- Information Systems Security Association (ISSA)
- International Disaster Recovery Association (IDRA)
- Business Recovery Managers Association (BRMA)
- British Standards Institution (BSI)