Database Recovery and Cloud Services

In: Computers and Technology

Submitted By sonam08
Disaster Recovery: Best Practices


Contents
1 Executive Summary
2 Disaster Recovery Planning
2.1 Identification and Analysis of Disaster Risks/Threats
2.2 Classification of Risks Based on Relative Weights
2.2.1 External Risks
2.2.2 Facility Risks
2.2.3 Data Systems Risks
2.2.4 Departmental Risks
2.2.5 Desk-Level Risks
2.3 Building the Risk Assessment
2.4 Determining the Effects of Disasters
2.4.1 List of Disaster Affected Entities
2.4.2 Downtime Tolerance Limits
2.4.3 Cost of Downtime
2.4.4 Interdependencies
2.5 Evaluation of Disaster Recovery Mechanisms
2.6 Disaster Recovery Committee
3 Disaster Recovery Phases
3.1 Activation Phase
3.1.1 Notification Procedures
3.1.2 Damage Assessment
3.1.3 Activation Planning
3.2 Execution Phase
3.2.1 Sequence of Recovery Activities
3.2.2 Recovery Procedures
3.3 Reconstitution Phase
4 The Disaster Recovery Plan Document
4.1 Document Contents
4.2 Document Maintenance
5 Reference

1 Executive Summary
Disasters are inevitable but mostly unpredictable, and they vary in type and magnitude. The best strategy is to have some kind of disaster recovery plan in place, to return to normal after the disaster has struck. For an enterprise, a disaster means abrupt disruption of all or part of its business operations, which may directly result in revenue loss. To minimize disaster losses, it is very important to have a good disaster recovery plan for every business subsystem and operation within an enterprise.
This paper discusses an approach for creating a good disaster recovery plan for a business enterprise. The guidelines are generic in nature, hence they can be applied to any business subsystem within the enterprise.
In the IT subsystem, disaster recovery is not the same as high availability. Though both concepts are related to business continuity, high availability is about providing undisrupted continuity of operations whereas disaster recovery involves some amount of downtime, typically measured in days. This paper focuses only on disaster recovery.
Every business disaster has one or more causes and effects. The causes can be natural or human or mechanical in origin, ranging from events such as a tiny hardware or software component's malfunctioning to universally recognized events such as earthquakes, fire, and flood. Effects of disasters range from small interruptions to total business shutdown for days or months, even fatal damage to the business.
The process of preparing a disaster recovery plan begins by identifying these causes and effects, analyzing their likelihood and severity, and ranking them in terms of their business priority. The ultimate results are a formal assessment of risk, a disaster recovery plan that includes all available recovery mechanisms, and a formalized Disaster Recovery Committee that has responsibility for rehearsing, carrying out, and improving the disaster recovery plan.
When a disaster strikes, the normal operations of the enterprise are suspended and replaced with operations spelled out in the disaster recovery plan. Figure 1 depicts the cycle of stages that lead through a disaster back to a state of normalcy.

Figure 1. Enterprise Operations Cycle of Disaster Recovery
It takes the enterprise some time to assess the exact effects of the disaster. Only when these are assessed and the affected systems are identified can a recovery process begin. The disaster recovery system cannot replace the normal working system forever, but only supports it for a short period of time. At the earliest possible time, the disaster recovery process must be decommissioned and the business should return to normalcy.
The disaster recovery plan does not stop at defining the resources or processes that need to be in place to recover from a disaster. The plan should also define how to restore operations to a normal state once the disaster's effects are mitigated. Finally, ongoing procedures for testing and improving the effectiveness of the disaster recovery system are part of a good disaster recovery plan.
In summary, the disaster recovery plan should (1) identify and classify the threats/risks that may lead to disasters, (2) define the resources and processes that ensure business continuity during the disaster, and (3) define the reconstitution mechanism to get the business back to normal from the disaster recovery state, after the effects of the disaster are mitigated. An effective disaster recovery plan plays its role in all stages of the operations as depicted above, and it is continuously improved by disaster recovery mock drills and feedback capture processes.
The second section of this paper explains the methods and procedures involved in the disaster recovery planning process. The third section explains the different phases of disaster recovery. The fourth section explains what information the disaster recovery plan should contain and how to maintain the plan.

2 Disaster Recovery Planning
This section explains the various procedures/methods involved in planning disaster recovery.

2.1 Identification and Analysis of Disaster Risks/Threats
The first step in planning recovery from unexpected disasters is to identify the threats or risks that can bring about disasters by doing risk analysis covering threats to business continuity. Risk analysis (sometimes called business impact analysis) involves evaluating existing physical and environmental security and control systems, and assessing their adequacy with respect to the potential threats.
The risk analysis process begins with a list of the essential functions of the business. This list will set priorities for addressing the risks. Essential functions are those whose interruption would considerably disrupt the operations of the business and may result in financial loss.
These essential functions should be prioritized based on their relative importance to business operations. For example, in the case of a telecom service provider, both billing operations and CRM/helpdesk operations are essential functions, but CRM/helpdesk is less essential than billing. Hence, mitigating the risks that affect billing operations should be given higher priority than mitigating those that affect CRM/helpdesk operations.
While evaluating the risks, it is also useful to consider the attributes of a risk (Figure 2).

Figure 2. Risk Attributes
The scope of a risk is determined by the possible damage, in terms of downtime or cost of lost opportunities. In evaluating a risk, it is essential to keep in mind the options around that risk, such as time of the day or day of the week, that can affect its scope. For example, spilling several gallons of toxic liquid across an assembly line area during working hours is a different situation than the same spill at night or during the weekend. While the time taken and cost to clean up the area are the same in both cases, the first case may require shutting down the assembly line area, which adds downtime cost to this event.
The magnitude of a risk may be different considering the affected component, its location, and the time of occurrence. The effects of a disaster that strikes the entire enterprise are different from the effects of a disaster affecting a specific area, office, or utility within the company.

2.2 Classification of Risks Based on Relative Weights
When evaluating risks, it is recommended to categorize them into different classes to accurately prioritize them. In general, risks can be classified in the following five categories.

2.2.1 External Risks
External risks are those that cannot be associated with a failure within the enterprise. They are very significant in that they are not directly under the control of the organization that faces the damages. External risks can be split into four subcategories:
Natural: These disasters are at the top of the list in every disaster recovery plan. Typically they damage a large geographical area. To mitigate the risk of disruption of business operations, a recovery solution should involve disaster recovery facilities in a location away from the affected area. Most meteorological threats can now be forecast, so there is a considerable chance of mitigating the effects of some natural disasters. Nevertheless, it is important to document the scope of these natural risks in as much detail as possible.
Human caused: These disasters include acts of terrorism, sabotage, virus attacks, operations mistakes, crimes, and so on. These also include the risks resulting from manmade structures. These may be caused by both internal and external persons.
Civil: These risks typically are related to the location of the business facilities. Typical civil risks include labor disputes ending in strikes, communal riots, local political instability, and so on. These again may be internal to the company or external.
Supplier: These risks are tied to the capacity of suppliers to maintain their level of services in a disaster. It is appropriate that a backup supplier pool be maintained in case of emergency.

2.2.2 Facility Risks
Facility risks are risks that affect only local facilities. While evaluating these risks, the following essential utilities and commodities need to be considered.
Electricity: To analyze the power outage risk, it is important to study the frequency of power outages and the duration of each outage. It is also useful to determine how many power feeds operate within the facility and, if necessary, make the power system redundant.
Telephones: Telephones are a particularly crucial service during a disaster. A key factor in evaluating risks associated with telephone systems is to study the telephone architecture and determine if any additional infrastructure is required to mitigate the risk of losing the entire telecommunication service during a disaster.
Water: There are certain disaster scenarios where water outages must be considered very seriously, for instance the impact of a water cutoff on computer cooling systems.
Climate Control: Losing the air conditioning or heating system may produce different risks that change with the seasons.
Fire: Many factors affect the risk of fire, for instance the facility's location, its materials, neighboring businesses and structures, and its distance from fire stations. All of these and more must be considered during risk evaluation.
Structural: Structural risks may be related to design flaws, defective material, or poor-quality construction or repairs.
Physical Security: Security risks have gained attention in recent years, and nowadays security is a mandatory 24-hour measure to protect each and every asset of the company from both outsiders and employees. Different secure access and authorization procedures, manual as well as automated ones, are enforced in enterprises. Factors such as workplace violence, bomb threats, trespassing, sabotage, and intellectual property loss are also considered during the security risk analysis.

2.2.3 Data Systems Risks
Data systems risks are those related to the use of shared infrastructure, such as networks, file servers, and software applications that could impact multiple departments. A key objective in analyzing these risks is to identify all single points of failure within the data systems architecture.
Data systems risks can also be due to inappropriate operation processes. Operations that have run for a long period of time on obsolete hardware or software are a major risk given the lack of spares or support. Recovery from this type of failure may be lengthy and expensive due to the need to replace or update software and equipment and retrain personnel.
Data systems risks may be evaluated within the following subcategories:

• Data communication network

• Telecommunication systems and network

• Shared servers

• Virus

• Data backup/storage systems

• Software applications and bugs

2.2.4 Departmental Risks
Departmental risks are the failures within specific departments. These would be events such as a fire within an area where flammable liquids are stored, or a missing door key preventing a specific operation.
An effective departmental risk assessment needs to consider all the critical functions within that department, key operating equipment, and vital records whose absence or loss will compromise operations. Unavailability of skilled personnel also can be a risk. The department should have necessary plans to have skilled backup personnel in place.

2.2.5 Desk-Level Risks
Desk-level risks are those that would limit or stop the day-to-day work of an individual employee. The assessment at this layer may feel a little like an exercise in paranoia. Every process and tool that makes up the individual's job must be examined carefully and accounted for as essential.

2.3 Building the Risk Assessment
Once the evaluation of the major risk categories is completed, it is time to score and sort all of them, category by category, in terms of their likelihood and impact. The scoring process can be approached by preparing a score sheet, as shown in Table 1, that has the following keys:

• Groups are the subcategories of the main risk category.

• Risks are the individual risks under each group that can affect the business.

• Likelihood is estimated on a scale from 0 to 10, with 0 being not probable and 10 highly probable. The likelihood that something happens should be considered in a long plan period, such as 5 years.

• Impact is estimated on a scale from 0 to 10, with 0 being no impact and 10 being an impact that threatens the company's existence. Impact is highly sensitive to time of day and day of the week.

• Restoration Time is estimated on a scale from 1 to 10. A higher value means a longer restoration time; hence the priority of having a disaster recovery mechanism for this risk is higher.

Table 1. Risk Assessment Form

Risk Assessment Form: External risks
Date:

Grouping            Risk                          Likelihood  Impact    Restoration Time  Score
                                                  (0 - 10)    (0 - 10)  (1 - 10)
Natural disasters   Earthquake                    1           9         10                90
                    Tornado                       0           0         10                0
                    Severe thunderstorm                                                   0
                    Hail                          8           3         9                 216
                    Snow/ice/blizzard             9           5         8                 360
Human caused risks  Sabotage or act of terror
                    Bridge collapse
                    Water leakage in facility
Civil issues        Riot
                    Labor stoppage and picketing
Suppliers           Power supplier
                    Transportation vendor

In the above example, multiplying the likelihood, impact, and restoration time ratings yields a rough risk analysis score. A zero in either the likelihood or the impact column makes the total risk score zero. Sorting the table in descending order of score puts the biggest risks at the top, and these are the risks that deserve the most attention.
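The scoring and sorting step can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not a prescribed tool; the `Risk` class and `rank_risks` function are hypothetical names, and the sample rows are the fully rated entries from Table 1.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    group: str
    name: str
    likelihood: int        # 0-10
    impact: int            # 0-10
    restoration_time: int  # 1-10

    @property
    def score(self) -> int:
        # Product of the three ratings; a zero likelihood or impact zeroes the score.
        return self.likelihood * self.impact * self.restoration_time


def rank_risks(risks):
    """Return the risks sorted by descending score (biggest risks first)."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Fully rated rows from Table 1.
risks = [
    Risk("Natural disasters", "Earthquake", 1, 9, 10),
    Risk("Natural disasters", "Tornado", 0, 0, 10),
    Risk("Natural disasters", "Hail", 8, 3, 9),
    Risk("Natural disasters", "Snow/ice/blizzard", 9, 5, 8),
]

for risk in rank_risks(risks):
    print(f"{risk.score:>4}  {risk.name}")
```

With these rows, snow/ice/blizzard (360) ranks first, followed by hail (216), earthquake (90), and tornado (0).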

2.4 Determining the Effects of Disasters
Once the disaster risks have been assessed and the decision has been made to cover the most critical risks, the next step is to determine and list the likely effects of each of the disasters. These specific effects are what will need to be covered by the disaster recovery process.
Simple "one cause multiple effects" diagrams (Figure 3) can be used as tools for specifying the effects of each of the disasters.

Figure 3. Disaster Effects Diagram
Note that multiple causes can produce the same effects, and in some cases the effects themselves may be the causes of some other effects.

2.4.1 List of Disaster Affected Entities
The intention of this exercise is to produce a list of entities affected by failure due to disasters, which need to be addressed by the disaster recovery plan. In Figure 3, the entities that fail due to the earthquake disaster are office facility, power system, operations staff, data systems, and telephone system. Table 2 provides a sample mapping of the cause, effects, and affected entities.

Table 2. Determination of Disaster Affected Entities

Risk (Disaster)     Effect of Disaster                  Disaster Affected Entity
Earthquake          Office space destroyed              Office space
                    Operators cannot report to work     Office staff
                    Power disruption                    Power
                    Data systems destroyed              Data systems
                    Desktops destroyed                  Desktops and workstations
                    Telecom failure                     Telephone instruments and network
Power supply cut    Power disruption                    Power
                    Data systems powered off            Data systems
                    Desktops powered off                Desktops/workstations
                    Data network down                   Network devices and links
                    Telecom failure                     Telephone instruments and network

Note that two or more disasters may affect the same entities, so the table also shows which entities are affected most often. The entities that appear most frequently in the table are the ones most likely to fail.
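Counting how often each entity appears can be automated once the mapping is in tabular form. The following Python sketch uses the rows of Table 2; the function name `entity_frequency` is hypothetical, and the entity label "Desktops/workstations" is normalized to "Desktops and workstations" so that both spellings count as one entity.

```python
from collections import Counter

# (disaster, effect, affected entity) rows transcribed from Table 2.
rows = [
    ("Earthquake", "Office space destroyed", "Office space"),
    ("Earthquake", "Operators cannot report to work", "Office staff"),
    ("Earthquake", "Power disruption", "Power"),
    ("Earthquake", "Data systems destroyed", "Data systems"),
    ("Earthquake", "Desktops destroyed", "Desktops and workstations"),
    ("Earthquake", "Telecom failure", "Telephone instruments and network"),
    ("Power supply cut", "Power disruption", "Power"),
    ("Power supply cut", "Data systems powered off", "Data systems"),
    ("Power supply cut", "Desktops powered off", "Desktops and workstations"),
    ("Power supply cut", "Data network down", "Network devices and links"),
    ("Power supply cut", "Telecom failure", "Telephone instruments and network"),
]


def entity_frequency(rows):
    """Count the number of distinct disasters that affect each entity."""
    pairs = {(disaster, entity) for disaster, _effect, entity in rows}
    return Counter(entity for _disaster, entity in pairs)


for entity, count in entity_frequency(rows).most_common():
    print(f"{count}  {entity}")
```

Here power, data systems, desktops and workstations, and the telephone system each appear under both disasters, while office space, office staff, and network devices appear under one.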

2.4.2 Downtime Tolerance Limits
Once the list of entities that may fail due to various types of disasters is prepared, the next step is to determine the downtime tolerance limit for each entity. This information is crucial for preparing the recovery sequence in the disaster recovery plan. The entities with lower downtime tolerance limits should be assigned higher priorities for recovery. One metric for evaluating the downtime tolerance limit is the cost of downtime.

2.4.3 Cost of Downtime
The cost of downtime is the key input for calculating the investment needed in a disaster recovery plan. Downtime costs can be divided into tangible and intangible costs.
Tangible costs are the direct consequences of a business interruption, such as lost revenue and lost productivity.
Intangible costs include lost opportunities when customers turn to competitors, loss of reputation, and similar factors.

2.4.4 Interdependencies
How the disaster affected entities depend upon each other is crucial information for preparing the recovery sequence in the disaster recovery plan. For example, having the data systems restored has a dependency on the restoration of power.
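One way to turn such interdependencies into a recovery sequence is a topological sort, sketched below with Python's standard `graphlib` module (Python 3.9 or later). The dependency map is an illustrative assumption; only the dependency of data systems on power comes from the text. The function name `recovery_sequence` is hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# entity -> entities that must be restored first.
# Apart from "Data systems depends on Power", these edges are assumptions
# made up for the example.
depends_on = {
    "Power": set(),
    "Data network": {"Power"},
    "Data systems": {"Power", "Data network"},
    "Telephone system": {"Power"},
    "Desktops and workstations": {"Power", "Data network"},
}


def recovery_sequence(depends_on):
    """Return one restoration order in which every entity follows its prerequisites."""
    return list(TopologicalSorter(depends_on).static_order())


order = recovery_sequence(depends_on)
print(" -> ".join(order))
```

Several valid orders usually exist, so in practice the ties could be broken by downtime tolerance. `TopologicalSorter` raises `graphlib.CycleError` if two entities depend on each other, which is itself a useful planning check.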

2.5 Evaluation of Disaster Recovery Mechanisms
Once the list of affected entities is prepared and each entity's business criticality and failure tendency are assessed, it is time to analyze the recovery methods available for each entity and determine the most suitable one. This step defines the resources employed in recovery and the process of recovery. Some of the typical entities are data systems, power, data network, and telephone systems. For each of these there are one or more recovery mechanisms in practice in the industry.
In the case of data systems, for example, the recovery mechanism usually involves having the critical data systems replicated somewhere else in the network and putting them online with the latest backed up data available. For less critical data systems, there may be an option to have spare server hardware, and if required these servers could be configured with the required application. Depending on the data system, there may be options of autorecovery or manual recovery, and the cost and recovery time factors of each mechanism vary.
In the case of power, options such as multiple power suppliers or having alternate sources of power such as diesel generators may be suitable. In certain cases, new mechanisms may need to be devised.
Given the many options and variations of disaster recovery mechanisms available, it is necessary to carefully evaluate which recovery mechanism is most suitable for an affected entity in a particular organization. The main factors that need to be considered are:

• Cost of deployment, maintenance, and operation

• Recovery time

• Ease of recovery activation and operation

2.6 Disaster Recovery Committee
Disaster recovery operations and procedures should be governed by a central committee. This committee should have representation from all the different company agencies with a role in the disaster recovery process, typically management, finance, IT (multiple technology leads), electrical department, security department, human resources, vendor management, and so on.
The Disaster Recovery Committee creates the disaster recovery plan and maintains it. During a disaster, this committee ensures that there is proper coordination between different agencies and that the recovery processes are executed successfully and in proper sequence.
The Disaster Recovery Committee should be authorized and responsible for:

• Creating and maintaining the disaster recovery plan

• Detecting and announcing disaster events within the company

• Activating the disaster recovery plan

• Executing the disaster recovery plan

• Monitoring the disaster situation continuously and returning operations to normal at the earliest feasible time

• Restoring normal operations and shutting down disaster recovery operations

• Continuously improving the disaster recovery plan by conducting periodic mock trials and incorporating lessons learned into the plan after an actual disaster
The roles, responsibilities, and reporting hierarchy of different committee members should be clearly defined both during normal operations and in the case of a disaster emergency. Backup members should also be designated in case of the primary member's unavailability.
Note that not all the members of the Disaster Recovery Committee may actively participate in the actual disaster recovery. But several key members of the committee, such as the operations manager, operations coordinator, and the respective operations team leads, will always actively participate.

3 Disaster Recovery Phases
Disaster recovery happens in the following sequential phases:

1. Activation Phase: In this phase, the disaster effects are assessed and announced.

2. Execution Phase: In this phase, the actual procedures to recover each of the disaster affected entities are executed. Business operations are restored on the recovery system.

3. Reconstitution Phase: In this phase the original system is restored and execution phase procedures are stopped.

3.1 Activation Phase
A disruption or emergency may happen with or without notice. A hurricane affecting a specific geographic area and a virus outbreak expected on a certain date are examples of disasters with advance notice. However, there may be no warning of a burst water pipe or a criminal act.
Quick and precise detection of a disaster event and an appropriate communication plan are key to reducing the effects of the impending emergency; in some cases they may give system personnel enough time to act gracefully, thus reducing the impact of the disaster.
The Disaster Recovery Committee is responsible for launching the activation phase. It should be well informed about the geographical, political, social, and environmental events that may pose threats to the company's business operations. It should have trusted information sources in the different agencies to forestall false alarms or overreactions to hoaxes.
The activation phase involves:

• Notification procedures

• Damage assessment

• Disaster recovery activation planning

3.1.1 Notification Procedures
The notification procedure defines the primary measures taken as soon as a disruption or emergency has been detected or definitely predicted. At the end of this phase, recovery staff will be ready to execute contingency actions to restore system functions on a temporary basis. Procedures should contain the process to alert recovery personnel during business and nonbusiness hours. After the disaster is detected, a notification should be sent to the damage assessment team so that they can assess the actual damage and implement subsequent actions.
Notification can take place by telephone, pager, e-mail, or cell phone. A notification policy must describe procedures to be followed when specific personnel cannot be contacted. Notification procedures should be documented clearly in the contingency plan.
A general notification technique is a call tree (Figure 4). The call tree should document primary and alternate contact methods and should include procedures to be followed if an individual cannot be contacted.

Figure 4. Call Tree Chart
Staff to be alerted should be unmistakably identified in the contact list in the plan. This list should classify personnel by their role, name, and contact information (home, work, and pager numbers, e-mail addresses, and home addresses). If disrupted systems have interconnection with external organizations, a point of contact should be identified in those organizations.
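A call tree with primary and alternate contact methods can be modeled as a small recursive data structure. The Python sketch below is a hypothetical illustration: the `Contact` class, the `notify` function, the role names, and the fallback policy (the caller continues down the branch when someone cannot be reached) are all assumptions, and `reach` stands in for whatever telephone or paging system is actually used.

```python
from dataclasses import dataclass, field


@dataclass
class Contact:
    name: str
    methods: list                                # contact methods, primary first
    callees: list = field(default_factory=list)  # people this contact must call


def notify(contact, reach, log=None):
    """Walk the call tree depth-first and return a log of (name, method) pairs.

    `reach(name, method)` is a caller-supplied function that returns True when
    the person answers. A method of None in the log means the person could not
    be reached by any method; their callees are still notified, on the
    assumption that the caller takes over that branch.
    """
    if log is None:
        log = []
    used = next((m for m in contact.methods if reach(contact.name, m)), None)
    log.append((contact.name, used))
    for callee in contact.callees:
        notify(callee, reach, log)
    return log


# Hypothetical tree: roles and methods are made up for the example.
tree = Contact("Operations manager", ["cell", "home"], [
    Contact("Network lead", ["cell", "pager"], [
        Contact("Network engineer", ["cell"]),
    ]),
    Contact("Server lead", ["cell", "e-mail"]),
])

# Simulated outcome: the network lead does not answer the cell phone.
unreachable = {("Network lead", "cell")}
log = notify(tree, lambda name, method: (name, method) not in unreachable)
for name, method in log:
    print(f"{name}: {method or 'NOT REACHED'}")
```

The returned log doubles as a record of who was reached and how, which can feed the damage assessment and later reviews of the plan.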
Notification information may contain the following:

• Nature of the emergency that has occurred or is imminent

• Loss of life or injuries

• Damage estimates

• Response and recovery details

• Where and when to assemble for briefing or further response instructions

• Instructions to prepare for relocation for the estimated time period

• Instructions to complete notifications using the call tree (if applicable)

3.1.2 Damage Assessment
To establish how the contingency plan will be executed following a service disruption, it is crucial to evaluate the nature and degree of the damage to the system. This damage evaluation should be done as quickly as conditions permit, with personnel safety given highest priority. Consequently, when possible, the damage assessment team is the first team notified of the incident.
It is worthwhile to prepare damage assessment guidelines for investigating different types of major alarms that may progress to a disaster. An example might be a sudden power outage noticed in a data center facility that has a UPS backup. The investigation may determine whether the power can be restored before the UPS system runs out of battery power, in which case activating the disaster recovery plan is not necessary, or otherwise, in which case the plan may be activated immediately.
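The UPS example amounts to comparing two time estimates. The following Python sketch shows one way such a guideline might be written down; the function name, the parameters, and the optional safety margin are assumptions made for the example, not part of any specific plan.

```python
def should_activate_plan(estimated_restore_minutes, ups_runtime_minutes,
                         safety_margin_minutes=0):
    """Return True when power is not expected back before the UPS is exhausted.

    The safety margin is an assumed buffer that makes the decision more
    conservative; a tie is treated as a reason to activate.
    """
    if min(estimated_restore_minutes, ups_runtime_minutes, safety_margin_minutes) < 0:
        raise ValueError("durations must be non-negative")
    return estimated_restore_minutes + safety_margin_minutes >= ups_runtime_minutes


# Power expected back in 20 minutes, UPS good for 45: no activation needed.
print(should_activate_plan(20, 45))
# Power expected back in 2 hours, UPS good for 45 minutes: activate.
print(should_activate_plan(120, 45))
```

In practice both estimates are uncertain, which is why the damage assessment team, not a formula alone, makes the final call.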
Damage assessment procedures vary with each particular emergency; nevertheless, the following may be considered in general:

• Origin of the emergency or disruption

• Potential for additional disruptions or damage

• Area affected by the emergency

• Status of physical infrastructure

• Inventory and functional status of the most important equipment

• Type of damage to equipment

• Items to be replaced

• Estimated time to restore normal services if disaster procedures were not in place

3.1.3 Activation Planning
While it is beneficial to detect a disaster at its earliest stage, putting a disaster recovery process into action for a false alarm may stall normal business operations and result in undue costs. Hence it is very important that disaster recovery be activated only when a thorough damage assessment has been conducted.
The disaster recovery plan should have one or more criteria for activation, which become the primary input for evaluating whether the plan should be activated for each affected system. Also, it should be determined whether activating disaster response will bring systems back on line faster than standard procedures.
Depending on the extent of the damage from the disaster, the entire Disaster Recovery Committee or a part of the committee may do the disaster activation planning. The outcome of this planning, at a minimum, should be:

• List of systems and services that need to be restored

• Their interdependencies and sequence of restoration

• Time estimations for each restoration (documented in the plan)

• Instructions for reporting failures to the team leads

• Plan for communication between teams
Once the disaster activation is planned, the appropriate team leads will notify staff and start their respective activities in sequence as they have been instructed.

3.2 Execution Phase
Recovery operations start just after the disaster recovery plan has been activated, appropriate operations staff have been notified, and appropriate teams have been mobilized. The activities of this phase focus on bringing up the disaster recovery system. Depending on the recovery strategies defined in the plan, these functions could include temporary manual processing, recovery and operation on an alternate system, or relocation and recovery at an alternate site.

3.2.1 Sequence of Recovery Activities
The recovery procedure reflects priorities previously analyzed during the activation planning phase. For instance, if a server room has been recovered after a disruption, the most critical servers should be restored before other, less critical servers. The procedures should also include instructions to coordinate with other teams when certain situations occur, such as:

• An action is not accomplished within the estimated time frame

• A key step has been completed

• Items must be procured
If a system must be recovered at a different location, specific items related to that service need to be transferred or obtained. Recovery procedures should delegate a team to manage shipment of equipment, data, and vital records. Procedures should explain requirements to package, transport, and purchase materials required to recover the system.

3.2.2 Recovery Procedures
The disaster recovery plan should provide detailed procedures to restore the system or system components. Procedures for IT service damage should address specific actions such as:

• Get authorization to access damaged premises or geographic area

• Notify users associated with the system

• Obtain required office supplies and work space

• Obtain and install required hardware components

• Obtain and load backup media

• Restore critical operating systems and application software

• Restore system data

• Test system functionality including security controls

• Connect system to network or other external systems
To avoid confusion in an emergency situation, the recovery procedures should be documented in a simple step-by-step format, without assuming or omitting any procedural steps.

3.3 Reconstitution Phase
In the reconstitution phase, operations are transferred back to the original facility once it is free from the disaster aftereffects, and execution-phase activities are subsequently shut down. If the original system or facility is unrecoverable, this phase also involves rebuilding. Hence the reconstitution phase may last from a few days to a few weeks or even months, depending on the severity of destruction and the site's fitness for restoration. As soon as the facility, whether repaired or replaced, is able to support its normal operations, the services may be moved back. The execution team should continue to be engaged until the restoration and testing are complete.
The following major activities occur in this phase:

• Continuously monitor the site or facility's fitness for reoccupation

• Verify that the site is free from aftereffects of the disaster and that there are no further threats

• Ensure that all needed infrastructure services, such as power, water, telecommunications, security, environmental controls, office equipment, and supplies, are operational

• Install system hardware, software, and firmware

• Establish connectivity between internal and external systems

• Test system operations to ensure full functionality

• Shut down the contingency system

• Terminate contingency operations

• Secure, remove, and relocate all sensitive materials at the contingency site

• Arrange for operations staff to return to the original facility

4 The Disaster Recovery Plan Document
The outcome of the disaster recovery planning process is the disaster recovery plan document. During an emergency, this document will be the primary source of information for disaster recovery procedures.

4.1 Document Contents
The disaster recovery plan document is the only reliable source of information for the disaster recovery during an emergency. It should be very easily readable, with simple and detailed instructions. Following are some of the contents that need to be in this document.

• Document Information: The document should include information such as the authors/owners with their contact details, revision history and other document details (name, location, version), references, and the intended audience. The revision history should briefly describe the changes made in each version. A table of contents is a must for quick reference, and it is highly recommended that the sections be numbered to the lowest possible level so that they are easy to cite. Because the document contains sensitive information, it should also carry an appropriate confidentiality classification.

• Purpose: The purpose of the document must be clearly stated in the introduction, defining the objectives the plan intends to achieve.

• Scope: The scope of the plan defines the circumstances under which the plan is invoked and the length of time the procedures defined in the document are in effect. The different failure conditions that lead to invoking the plan should be clearly listed. For example, a system being down for a couple of hours may not result in invoking the plan, but a daylong outage may suffice. Similarly, the conditions at the failed system/facility that warrant the reconstitution phase should also be clearly stated.

• Assumptions: Any conditions the plan assumes to be present for success should be clearly stated. This may involve listing the dependencies of the plan as well. For example, a certain number of trained personnel may be assumed to be available at the disaster recovery facility. Wherever possible, these dependencies must be accompanied by the appropriate contact details.

• Exclusions: Any related disaster activities that the plan does not cover should be stated and any known references mentioned here. For example, the plan may exclude the dependent power restoration plan, referring instead to the appropriate document and the department contact details. Such information will be useful during the disaster recovery.

• System Description: The description of the disaster recovery system should be simple to understand, with appropriate figures, workflow charts, and so on. If necessary, the descriptions may reference appendices that give more detail. The functions that must be restored should be clearly identified.

• Roles and Responsibilities: The roles of the managerial and technical staff and their responsibilities during the activation, execution, and reconstitution phases should be clearly listed. An organization structure diagram showing the reporting relationships is beneficial. Key roles should have primary and alternate personnel assigned.

• Contact Details: Full contact information should be included for all the managerial and technical staff involved in the planning, activation, execution, and reconstitution phases. Contact details for both normal and emergency situations should be given. It is recommended that this information be added as an appendix to the disaster recovery plan document.

• Activation Procedures: The procedures for notification, damage assessment, and activation planning should be outlined. Any topic that needs to be covered in great detail may be added as an appendix.

• Execution Procedures: The recovery procedure for each of the components the plan covers should be explained step by step in detail. When there are parallel threads of tasks, a flow chart is beneficial for visualizing the dependencies among the tasks. The success and failure criteria of each procedure should also be stated, along with instructions on further actions in either case.

• Reconstitution Procedures: Similar procedures for the reconstitution of the components should be explained in detail. The success and failure criteria and instructions for further actions in case of success and failure should be given.
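
The dependencies among parallel recovery tasks mentioned under Execution Procedures can also be expressed in a form that a script can order. The sketch below is one possible illustration using the `graphlib` module from the Python standard library (Python 3.9 or later). The task names and the `recovery_waves` function are hypothetical and are not taken from the plan itself.

```python
from graphlib import TopologicalSorter

# Hypothetical recovery tasks, each mapped to the tasks it depends on.
dependencies = {
    "restore_network": set(),
    "restore_storage": set(),
    "restore_database": {"restore_storage", "restore_network"},
    "restore_application": {"restore_database"},
    "verify_services": {"restore_application"},
}


def recovery_waves(deps):
    """Group tasks into waves; tasks in the same wave have no mutual
    dependencies and could be carried out in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies loop
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all tasks whose prerequisites are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = recovery_waves(dependencies)
for number, wave in enumerate(waves, start=1):
    print(f"Wave {number}: {', '.join(wave)}")
```

Under these assumed dependencies, network and storage restoration form the first wave, and each later task waits for the wave before it. A circular dependency is reported as an error, which is itself a useful check on a written procedure.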

4.2 Document Maintenance
The disaster recovery plan document needs to be kept up to date with the current organization environment. A plan that is not updated and tested is as bad as having no plan at all, because an outdated document may be misleading during an emergency. The following practices are recommended for maintaining the plan documentation.

• Periodic Mock Drills: The disaster recovery plan should be tested from time to time using scheduled mock drills. A drill usually will not affect active operations; however, if it is known that operations will be affected, the drill should be carefully scheduled within a permissible window so that the effect is minimal. These activities should be regarded similarly to regular equipment maintenance activities that require operations downtime. The lessons from each mock drill should be incorporated into the disaster recovery plan document.

• Experience Capture: The best testing the document will undergo is when an actual disaster happens, and the lessons learned during the disaster recovery are valuable for improving the plan. Hence the Disaster Recovery Committee should ensure that the experience gets captured as lessons learned and the document gets updated accordingly.

• Periodic Updates: Technologies, systems, and facilities that the plan covers may change over time. It is important that the disaster recovery plan document reflect the current information about the components it covers. For this purpose, the Disaster Recovery Committee should ensure that the document is audited periodically (say once every quarter) against the present components in the organization. Another way to achieve this is to ensure that the committee is notified of any change that happens to any system/component in the organization so that the committee may update the document accordingly.
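
A periodic audit of this kind amounts to comparing what the document describes with what is actually deployed. The following Python sketch shows one way such a comparison might look, assuming both the documented components and the current inventory can be exported as simple name-to-version mappings. The component names, versions, and the `audit_plan` function are hypothetical.

```python
def audit_plan(documented, inventory):
    """Compare components listed in the plan with the current inventory.

    Both arguments map a component name to a version string.
    """
    documented_names = set(documented)
    inventory_names = set(inventory)
    return {
        # Present in the organization but not covered by the plan.
        "missing_from_plan": sorted(inventory_names - documented_names),
        # Covered by the plan but no longer present.
        "no_longer_present": sorted(documented_names - inventory_names),
        # Present in both, but the recorded version differs.
        "changed": sorted(
            name
            for name in documented_names & inventory_names
            if documented[name] != inventory[name]
        ),
    }


# Hypothetical data for illustration only.
documented = {"db-server": "v1", "mail-gateway": "v2", "file-server": "v1"}
inventory = {"db-server": "v2", "mail-gateway": "v2", "backup-appliance": "v1"}

report = audit_plan(documented, inventory)
for finding, names in report.items():
    print(f"{finding}: {names}")
```

Any non-empty list in the report indicates a section of the plan document that the Disaster Recovery Committee may need to revise.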

5 Reference
Contingency Planning Guide for Information Technology Systems: Recommendations of the National Institute of Standards and Technology, by Marianne Swanson, Amy Wohl, Lucinda Pope, Tim Grance, Joan Hash, and Ray Thomas. NIST Special Publication 800-34; available at: http://csrc.nist.gov/publications/nistpubs/800-34/sp800-34.pdf.

