Monday, 22 May 2017

Is Your Disaster Recovery Plan & Process Up to Scratch?

In recent years, the IT landscape has been transformed by digitalisation, Cloud, social and mobile – probably with significant impact on what a business-critical system looks like and what constitutes an IT disaster for your business.

So, how regularly do you review your current disaster recovery and business continuity plans (if you truly have them)? And how can you be sure your disaster recovery provision is up to scratch?

The first and most important question to ask of any disaster recovery plan is: when did you last test it? If you haven’t tested it, then you don’t have one.

The second tranche of questions arises from the outcome of your test: how do the results compare against your RPO and RTO? And how does the business feel about that?


Let’s Have a Proper, Grown-Up Conversation

What are RPO and RTO anyway? Both are measured in time. Recovery Point Objective (RPO) is the maximum amount of data you are prepared to lose, expressed as how far back in time you would have to go to reach the last recoverable copy. Is it yesterday evening? Is it 2 hours ago or 15 minutes ago? Recovery Time Objective (RTO) is the maximum time you are prepared to wait for your systems to be up and running again, measured from the point disaster recovery is invoked.
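
As a rough illustration of the two measures, the sketch below works out the data-loss window and the recovery duration from the timestamps of a single test and compares them with the agreed objectives. The class, the function names and all of the figures are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryTest:
    """Timestamps from one disaster recovery test (hypothetical example data)."""
    last_recoverable_point: datetime  # newest data that could be restored
    failure_occurred: datetime        # when the simulated disaster struck
    recovery_invoked: datetime        # when disaster recovery was invoked
    service_restored: datetime        # when systems were usable again


def data_loss_window(test: RecoveryTest) -> timedelta:
    """How much data was lost, as a span of time: compare this against the RPO."""
    return test.failure_occurred - test.last_recoverable_point


def recovery_duration(test: RecoveryTest) -> timedelta:
    """Time from invocation to restored service: compare this against the RTO."""
    return test.service_restored - test.recovery_invoked


def meets_objectives(test: RecoveryTest, rpo: timedelta, rto: timedelta) -> bool:
    """True only if the test result is within both objectives."""
    return data_loss_window(test) <= rpo and recovery_duration(test) <= rto


# Assumed figures: nightly backup at 02:00, failure at 09:30,
# recovery invoked at 10:00, service restored at 13:00.
example = RecoveryTest(
    last_recoverable_point=datetime(2017, 5, 22, 2, 0),
    failure_occurred=datetime(2017, 5, 22, 9, 30),
    recovery_invoked=datetime(2017, 5, 22, 10, 0),
    service_restored=datetime(2017, 5, 22, 13, 0),
)

print(data_loss_window(example))   # 7:30:00
print(recovery_duration(example))  # 3:00:00
print(meets_objectives(example, rpo=timedelta(hours=4), rto=timedelta(hours=4)))  # False
```

In this made-up test the systems came back within a four-hour RTO, but seven and a half hours of data were lost against a four-hour RPO, so the plan would not yet meet what the business asked for.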

These decisions have to be business-led.

They require some hard talking between IT and the business: “If we lost x hours of trading or operational data, how much would it cost us?  And y hours?  And z hours?” and “If we were out of action for x hours what would it mean?  Or y days?”
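
To put numbers in front of the business, even a very simple calculation can help frame the conversation. The sketch below assumes a flat cost per hour, which is a simplification (real losses rarely scale evenly), and the hourly figure and function names are invented purely for illustration.

```python
def estimated_loss(hours_lost: float, cost_per_hour: float) -> float:
    """Linear estimate of the cost of losing a number of hours (an assumption)."""
    return hours_lost * cost_per_hour


def loss_by_scenario(scenario_hours, cost_per_hour):
    """Map each candidate outage or data-loss duration to its estimated cost."""
    return {hours: estimated_loss(hours, cost_per_hour) for hours in scenario_hours}


# Hypothetical figure: replace with a number agreed with the business.
ASSUMED_COST_PER_HOUR = 5_000.0

for hours, loss in loss_by_scenario([1, 4, 24], ASSUMED_COST_PER_HOUR).items():
    print(f"{hours:>2} hours lost: about {loss:,.0f}")
```

Running the same calculation per system, each with its own hourly figure, gives IT and the business a shared starting point for deciding where a tighter RPO or RTO is worth paying for.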

Bear in mind that the desirable RTO and RPO might not be the same for all systems and software.

How the business answers these questions will determine the disaster recovery plan and steer IT to provide appropriate solutions to match these expectations.


Architecting a Solution

Once you know the RTO and RPO that the business is willing to accept, you’ll need to conduct a risk analysis – a review of the hardware, software, data centre, cloud, communications, and premises with a view to the potential key risks.

A number of tools and templates are available to help you work through this process. Alternatively, you might prefer to have external IT consultants carry out the risk analysis in relation to your business impact analysis.

In order to architect a solution that will meet your acceptable RTO and RPO, you will find it useful to talk to the providers or vendors of each solution you currently have in place to see what is available to meet your disaster recovery needs.

Once you have architected a solution, you then need to formally document the process or set of procedures by which you recover and protect your IT systems and infrastructure in the event of a disaster.

And test it. Then test it again, regularly, to ensure it works and meets your expected needs.


How Cloud and Virtualisation Are Disrupting Plans

Today, most businesses are using some Cloud solutions – whether it is a SaaS application like Salesforce or an IaaS solution like Amazon Web Services (AWS) – even if they haven’t switched to a virtualised environment.

The rapid adoption of virtualised environments and cloud technologies offers some significant advantages in terms of disaster recovery.  Certainly, the ability to take what are called “snapshots” of virtualised machines means that it is possible to reduce RPO and RTO to an absolute minimum in certain risk models.
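
As a rough sketch of why snapshot frequency matters for RPO: if a snapshot is taken at a fixed interval and needs some further time before it is usable at the recovery site, the worst case is a failure just before the next snapshot becomes usable. The "replication lag" parameter and the function names below are assumptions made for this example, not terms from any particular product.

```python
from datetime import timedelta


def worst_case_data_loss(snapshot_interval: timedelta,
                         replication_lag: timedelta) -> timedelta:
    """Worst-case data loss if failure strikes just before the next snapshot is usable."""
    return snapshot_interval + replication_lag


def schedule_meets_rpo(snapshot_interval: timedelta,
                       replication_lag: timedelta,
                       rpo: timedelta) -> bool:
    """True if the snapshot schedule keeps worst-case loss within the RPO."""
    return worst_case_data_loss(snapshot_interval, replication_lag) <= rpo


# Assumed figures: a snapshot every 15 minutes, usable off-site 5 minutes later.
interval = timedelta(minutes=15)
lag = timedelta(minutes=5)

print(worst_case_data_loss(interval, lag))                        # 0:20:00
print(schedule_meets_rpo(interval, lag, timedelta(minutes=30)))   # True
print(schedule_meets_rpo(interval, lag, timedelta(minutes=15)))   # False
```

The point of the example is that a 15-minute snapshot schedule does not by itself guarantee a 15-minute RPO; the time taken for each snapshot to reach somewhere safe counts too.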

Renting space in an off-site datacentre doesn’t hold the same allure when all your data is being held and most of your applications are running in an off-site datacentre anyway.

However, this doesn’t mean that Cloud holds all the answers.  You still need to worry about disaster recovery – but your efforts will have a different focus.

Instead of being a datacentre issue, in a virtualised environment or in a business that relies heavily on cloud solutions, disaster recovery becomes more focused on communications, data management and, possibly, business continuity.


A New Set of Risks

If all your major customer data and business-critical systems are in the cloud, moving offices might be a simpler response to a geo-disaster affecting your premises or your connectivity to the internet.  Disaster recovery becomes much less a tech process of bringing systems back online, and more about the logistics of re-siting offices, people and desks.

However, Cloud also opens up a new set of risks to your data: ransomware, hacking and administrative error. It becomes even more vital that you are talking to application and service vendors about how they protect your data and applications, and that you are architecting your disaster recovery solution and plan around that.

It is also worth bearing in mind that Cloud makes you much more vulnerable to a communications disaster.  You will absolutely need to be talking to your building management and communication service providers about the routes into your building and the possible alternatives to your current ISPs should a workman dig through a cable outside your office.

And test them!


Would you like more advice?  Or do you need specific help architecting, documenting or testing your own disaster recovery plans?  The Grant McGregor team can help: call us now on 0808 164 4142.


Image source: Pexels