Until science is able to predict when and where hurricanes, tornadoes,
and earthquakes will strike, every business that has a large amount of networking
equipment needs to have a plan in place before disaster strikes.
In 1992 Hurricane Andrew put 39 major data centers out of commission. And in
1993, the World Trade Center bombing caused 21 data centers to shut down.
While you don't like to think about it, every organization, regardless of its
size, runs the risk of a major systems outage, such as a tornado demolishing
a data center or a building fire destroying the facility and everything in it.
A study by the University of Texas found that 85 percent of businesses depend
totally or heavily on information technology systems to stay in business, and
that a loss of those systems would cost businesses up to 40 percent of their
daily revenues.
Disaster can strike at any time
There are more than 35 types of disasters, ranging from the most common, such
as power outages, to the most catastrophic, such as earthquakes. In essence,
a disaster includes any type of interruption of service that results from some
force beyond the organization's control. Disaster recovery provides systematic
procedures for how to react to and how to recover from that ominous external
or internal force. Disaster recovery planning, which complements business continuity
and contingency planning, ensures the ability of the organization to function
effectively if an unforeseen event severely disrupts normal operations.
The following checklist will help the key individuals in your organization
work through the thought process of preparing a disaster recovery plan. The
objective is to restore all critical business functions, not just an
isolated function such as the data center.
Gather Information and Organize the Project
A successful initiative of this magnitude requires support from the organization's
senior management, a dedicated disaster recovery team whose members
have knowledge of critical business systems, and a well-thought-out planning
and testing strategy.
Senior executives responsible for disaster recovery planning will perform the
first two steps. The disaster recovery coordinator, working with the appropriate
team leaders, should perform steps 3 to 7 (steps 6 and 7 are covered in part
2 of this article).
Determine which senior executive(s) will have overall responsibility for
disaster recovery.
Have this executive appoint a disaster recovery coordinator.
Appoint a disaster recovery team leader for each operational unit, such
as server backup or telephone system.
Convene the disaster recovery planning team and sub-teams as appropriate.
Working with senior executives responsible for disaster recovery, the disaster
recovery coordinator should identify the following:
Scope: the areas to be covered by the disaster recovery plan
Objectives: what is being worked toward and the course of action
that the disaster recovery team intends to follow
Assumptions: what is being taken for granted or accepted as true without
proof
Conduct Business Impact Analysis
The disaster recovery planning team should perform this step to identify
which business departments, functions, or systems are most vulnerable to
potential threats, what types of threat they face, and what effect each
identified threat would have on each of the vulnerable areas
within the organization.
Identify functions, processes, and systems.
Interview information systems support personnel.
Interview business unit personnel.
Analyze results to determine critical systems, applications, and business
processes.
Prepare an impact analysis of interruptions to critical systems.
Conduct Risk Assessment
The disaster recovery planning team should work with the organization's
technical and security personnel to determine the probability of each functional
business unit's critical systems becoming severely disrupted and to document
the amount of risk the business unit can tolerate. For each critical
system, complete the following tasks:
Review physical security (secure offices, off-hours building access,
etc.).
Review backup systems and data security.
Review policies on personnel termination and transfer.
Identify systems supporting mission critical functions.
Identify vulnerabilities to threats such as physical attacks or to acts of God
such as floods.
Assess probability of system failure or disruption.
Prepare risk and security analysis.
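One way to make the probability assessment and the tolerable-risk threshold concrete is to score each critical system and rank the results. The sketch below is illustrative only: the SystemRisk record, the 0-to-1 probability scale, the 1-to-5 impact scale, the probability-times-impact score, and the sample systems are all assumptions, not part of the checklist itself.

```python
from dataclasses import dataclass


@dataclass
class SystemRisk:
    name: str
    probability: float  # estimated chance of severe disruption, 0.0 to 1.0 (assumed scale)
    impact: int         # business impact if disrupted, 1 (minor) to 5 (critical) (assumed scale)
    acceptable: float   # highest score the business unit will tolerate (assumed)

    @property
    def score(self) -> float:
        # Simple probability-times-impact score; other weightings are possible.
        return self.probability * self.impact

    @property
    def exceeds_tolerance(self) -> bool:
        return self.score > self.acceptable


def rank_by_risk(systems):
    """Return systems ordered from highest to lowest risk score."""
    return sorted(systems, key=lambda s: s.score, reverse=True)


# Hypothetical systems and estimates, for illustration only.
systems = [
    SystemRisk("payroll server", 0.10, 5, 0.4),
    SystemRisk("telephone system", 0.30, 3, 1.0),
    SystemRisk("file archive", 0.05, 2, 0.5),
]

ranked = rank_by_risk(systems)
for s in ranked:
    flag = "REVIEW" if s.exceeds_tolerance else "ok"
    print(f"{s.name}: score={s.score:.2f} ({flag})")
```

A system flagged REVIEW is one whose estimated risk is higher than the business unit said it could tolerate, which makes it a natural candidate for the risk and security analysis.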
Develop Strategic Outline for Recovery
The steps outlined here provide all of the components necessary to perform
a recovery. These steps will help pull together information about the operations
of all systems, especially those owned or managed by non-technical managers
with help from technical support personnel. Steps one through four mainly
apply to functional business units that manage technology systems to process
critical functions. The disaster recovery planning team and the functional
business unit may wish to appoint other appropriate individuals to perform
subsequent tasks.
Assemble groups as appropriate for the following:
Hardware and operating systems
Communications
Applications
Facilities
Other critical functions and business processes as identified in
the Business Impact Analysis step.
For each system/process above, quantify the following processing requirements:
Light, normal, and heavy processing days
Transaction volumes
Dollar volume, if any
Estimated process time
Allowable delays (days, hours, minutes, etc.)
Detail all the steps in your workflow for each critical business function.
(For example, for payroll processing, include each step that must be completed
and the order in which to complete them.)
Identify systems and applications.
Component name and technical identification if any
Type (online, batch process, script)
Frequency
Run time
Allowable delay (days, hours, minutes, etc.)
Identify all vital records.
Name and description
Type (backup, original, master, history)
Where are they stored?
Source of item or record
Can the record be easily replaced by another source?
Backup and backup generation frequency
Number of backup generations available onsite and off-site
Location of backups
Media key, retention period, rotation cycle
Who is authorized to retrieve the backups?
Identify the minimum requirements for maintaining or replacing the critical
function during a severe disruption.
Type (server hardware, software, research materials, etc.)
Item name and description
Quantity required
Location of inventory, alternative, or off-site storage
Vendor/supplier
Identify whether alternative processing methods exist or could be
developed, quantifying their effect on processing (include manual processes).
Identify person(s) who support the system or the application.
Identify primary person to contact if system or application cannot function
as normal.
Identify secondary person to contact if system or application cannot
function as normal.
Identify all vendors associated with the system or application.
Document business unit strategy during recovery (conceptually how will
the unit function?).
Quantify resources required for recovery by time frame.
Develop and document recovery strategy, including priorities for recovering
system/function components, and recovery schedule.
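The system and application details gathered above (type, run time, allowable delay) can feed directly into a first-draft recovery priority list. The following is a minimal sketch, assuming that the component with the shortest allowable delay should be recovered first. The Component record, the recovery_order function, and the sample inventory are hypothetical names and data introduced for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Component:
    name: str
    kind: str                   # "online", "batch process", or "script"
    run_time: timedelta         # how long one run normally takes
    allowable_delay: timedelta  # how long the business can wait for it


def recovery_order(components):
    """Return components sorted so the least delay-tolerant comes first."""
    return sorted(components, key=lambda c: c.allowable_delay)


# Hypothetical inventory, for illustration only.
inventory = [
    Component("month-end report", "script", timedelta(minutes=10), timedelta(days=7)),
    Component("order entry", "online", timedelta(minutes=5), timedelta(hours=4)),
    Component("payroll run", "batch process", timedelta(hours=2), timedelta(days=2)),
]

schedule = recovery_order(inventory)
for priority, c in enumerate(schedule, start=1):
    print(f"{priority}. {c.name} ({c.kind}), allowable delay {c.allowable_delay}")
```

Sorting on allowable delay alone is a deliberate simplification. A real recovery schedule would also weigh dependencies between systems and the resources available in each time frame.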
Review Backup and Recovery Procedures
The disaster recovery planning team should review both on-site and
off-site procedures to provide for a current backup of critical programs
and data that can be used in the event of a disaster. In doing so, the disaster
recovery planning team can reduce downtime and speed recovery.
Review current records (operating systems, code).
Review current off-site storage facility or arrange for one.
Review backup and off-site backup storage policy or create one.
Present to functional business unit leader for approval.
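Part of this review can be automated once the vital records inventory exists. The sketch below flags records whose backup is overdue or that have no off-site generation. The VitalRecord fields, the backup_gaps function, and the sample record are assumptions made for illustration. They loosely mirror the vital records details listed earlier rather than any particular backup product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class VitalRecord:
    name: str
    backup_interval: timedelta  # how often a backup is supposed to be taken
    last_backup: date           # date of the most recent backup
    onsite_generations: int     # backup generations kept on-site
    offsite_generations: int    # backup generations kept off-site


def backup_gaps(record, today):
    """Return a list of problems found for one vital record."""
    gaps = []
    if today - record.last_backup > record.backup_interval:
        gaps.append("backup overdue")
    if record.offsite_generations == 0:
        gaps.append("no off-site copy")
    return gaps


# Hypothetical record and review date, for illustration only.
ledger = VitalRecord("general ledger", timedelta(days=1), date(2024, 1, 1), 3, 0)
problems = backup_gaps(ledger, today=date(2024, 1, 5))
print(ledger.name, problems)
```

Passing the review date in explicitly, rather than reading the system clock, keeps the check repeatable when the review is rerun for an audit.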
Select Alternate Facility
The disaster recovery planning team should look for a location,
other than the normal facility, that can be used to process data and/or conduct
business in the event of a disaster.
Determine resource requirements.
Assess platform uniqueness of unit systems (Macintosh, IBM, Oracle,
etc.).
Identify alternative facilities.
Review cost/benefit.
Evaluate and make recommendation.
Present to business unit leader for approval.
Make selection.
In part 2, we will cover Plan Development, Testing, and Ongoing Maintenance
for your disaster recovery plan.