What is ASR?
Our clients and potential future clients are often interested in using Azure Site Recovery (ASR) to improve their current levels of disaster protection. ASR is one of the many services waiting to be consumed if and when needed, and it provides the backbone of a well-rounded and capable DR solution with the following core elements:
A flexible replication/cloning engine that can:
Protect from on-premise to Azure
Protect from on-premise to on-premise
Protect from Azure to an Azure tenant
Fail back to the source location
Integrate well with Hyper-V, vSphere and physical hardware
Manage the post-failover configuration of protected workloads, such as size, storage, network and IP addressing
Failover runbooks that can:
Fail over single servers or groups of servers in a specific order
Integrate control of workloads and Azure assets, such as backups and storage, using various types of code
The ability to perform test failovers without impacting production services.
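The ordering behaviour of a failover runbook described above can be sketched in a few lines of Python. This is an illustration only: it does not call any Azure API, and the `FailoverGroup` class, the `build_failover_sequence` function, the server names and the post-failover actions are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class FailoverGroup:
    """One ordered group in a hypothetical failover runbook."""
    order: int
    name: str
    servers: list
    post_actions: list = field(default_factory=list)


def build_failover_sequence(groups):
    """Return (step type, group name, item) tuples in execution order."""
    steps = []
    # Groups run strictly in ascending order; servers first, then actions.
    for group in sorted(groups, key=lambda g: g.order):
        for server in group.servers:
            steps.append(("failover", group.name, server))
        for action in group.post_actions:
            steps.append(("action", group.name, action))
    return steps


# Hypothetical plan: identity first, then applications, then web front ends.
plan = [
    FailoverGroup(2, "application", ["app01", "app02"], ["start application services"]),
    FailoverGroup(1, "identity", ["dc01"]),
    FailoverGroup(3, "web", ["web01"], ["update public DNS"]),
]

sequence = build_failover_sequence(plan)
for kind, group, item in sequence:
    print(f"{kind:8} {group:12} {item}")
```

Modelling the plan as data before building it in ASR makes it easier to review the order with application owners, and the same structure can be reversed for failback.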
ASR provides the core functionality needed to enable a DR solution and can be scaled out as needed. The costs incurred when DR is not invoked are relatively low. Your charges will be:
ASR License: £18.63 per month per host
Storage costs on a per-disk basis, which can be estimated using the Azure pricing calculator; for example, 65 GB of storage costs around £45 per month
There may also be a small amount of data egress cost
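As a rough illustration of how the charges above add up, here is a minimal Python sketch. The licence figure is the one quoted above; the host count, the per-disk storage price applied to every host and the function name `monthly_standby_cost` are assumptions made purely for the example, so use the Azure pricing calculator for real figures.

```python
from decimal import Decimal

# Licence price quoted above, in GBP per protected host per month.
ASR_LICENCE_PER_HOST = Decimal("18.63")


def monthly_standby_cost(host_count, disk_costs, egress=Decimal("0")):
    """Estimate the monthly cost in GBP while DR is not invoked."""
    licence = ASR_LICENCE_PER_HOST * host_count
    storage = sum(disk_costs, Decimal("0"))
    return licence + storage + egress


# Assumed scenario: ten hosts, each with one replicated disk at the
# example price of 45 GBP per month, and no egress charges.
estimate = monthly_standby_cost(10, [Decimal("45")] * 10)
print(f"Estimated standby cost: GBP {estimate:.2f} per month")
```

Under those assumptions the licences come to 186.30 and the storage to 450.00, giving 636.30 per month. Storage is normally the larger part of the bill, so disk sizing deserves the most attention.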
RPOs and RTOs
ASR commonly achieves RPOs of around 10 minutes. This is highly configurable: you can adjust your ASR infrastructure capacity and internet access capacity, and change the priority of your replications.
RTOs with ASR are very short, normally between 3 and 6 minutes. However, that's not the whole story: you still need to configure how your services start, ensure your applications are running, and think about the data exchanged between two interfacing hosts with slightly different RPOs. We are able to assist heavily in these areas using our custom scripting and applications.
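The point about interfacing hosts with different RPOs can be made concrete with a small Python sketch. It assumes you already have the latest recovery point time for each host (collecting them is out of scope here), and the host names, timestamps, five-minute tolerance and the `rpo_skew` function are all hypothetical.

```python
from datetime import datetime, timedelta


def rpo_skew(recovery_points, interfaces, tolerance):
    """Return (host_a, host_b, gap) for interfacing pairs whose
    recovery points are further apart than the tolerance."""
    flagged = []
    for host_a, host_b in interfaces:
        gap = abs(recovery_points[host_a] - recovery_points[host_b])
        if gap > tolerance:
            flagged.append((host_a, host_b, gap))
    return flagged


# Hypothetical latest recovery points for three protected hosts.
recovery_points = {
    "sql01": datetime(2024, 1, 1, 12, 0),
    "app01": datetime(2024, 1, 1, 11, 52),
    "web01": datetime(2024, 1, 1, 11, 59),
}
# Pairs of hosts that exchange data with each other.
interfaces = [("sql01", "app01"), ("app01", "web01"), ("sql01", "web01")]

flagged = rpo_skew(recovery_points, interfaces, timedelta(minutes=5))
for host_a, host_b, gap in flagged:
    print(f"{host_a} and {host_b} differ by {gap}; check their interface data")
```

Any pair flagged this way is a candidate for a reconciliation step after failover, since one side may hold transactions the other has not yet seen.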
What else do you need?
ASR should be seen as the engine of a DR solution that is supplemented with more specific Azure services to refine the invocation of DR processes. We're talking about the finer details such as:
Adding your protected workloads into backup for ongoing running in DR mode
Assigning pre-defined public IP addresses, NAT and WAF solutions
Modifying public DNS records for incoming connectivity
Working with external interfaces and your changing public IP addresses
Preventing accidental data runaway through your interfaces immediately post fail over
Ensuring your on-premise and internet-based end-user compute can access the failed-over environment
There's no place like home
Once you've successfully failed over into Azure there's the small matter of the 'return to home' event.
In a nutshell, it's very similar to the failover event. You will need to re-replicate your VMs back to on-premise (hello, egress costs); once replication is complete and you're back to small delta changes, you can run your failover runbooks in reverse. The main difference is between your protected VMs and the VMs that were never protected, which were probably recovered from shutdown or from backups. We'd recommend stopping your services for long enough to allow your application and information teams to understand the full picture and to restart applications in a (very) controlled manner.
The business stuff
It's important to know which workloads need protecting by your ASR-driven DR solution, as resources are finite and priorities must be assigned to your efforts and budget. Make sure your approach is driven top-down from the business by following your business continuity plan (BCP), and speak with your customers to understand their needs when running in disaster mode. You will need your customers to provide feedback that will ultimately help you protect the correct services and potentially provide some form of UAT support later.
Business buy-in is important to any project, and especially so with disaster recovery. Pressured situations can cause unpredictable behavior, so it is important to have the finer detail (such as which applications and data are covered) locked down and agreed prior to testing and live DR running. With expectations clearly set, IT departments can work more effectively with their stakeholders on any challenges that DR events present.
Not every workload can be moved into Azure using ASR without a bit of work
Well, most operating systems can; check the support matrix: https://docs.microsoft.com/en-us/azure/site-recovery/vmware-physical-azure-support-matrix
What we mean is that many workloads will need some work before they can run on a different network (in Azure or on-premise), keep speaking with their interface endpoints and still allow users to access their client applications. We typically undertake a thorough audit of the client's environment, focusing mainly (but not exclusively) on:
Interfaces, ensuring they will continue to function on a new network IP range with new firewall rules
Application delivery: will it continue to work from a cloud location, and is any support needed, such as a remote desktop solution or web farm?
Any post-failover processes, such as altering AD Sites and Services or stopping or starting certain services
File locations. We don't tend to replicate large file services; DR for these can be achieved using Azure File Sync or by migrating files to SharePoint Online. Interface UNC file drop locations should be moved to a dedicated file server that can be protected and moved with the appropriate ASR failover runbook
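For the interface audit, one common approach is to keep each server's host offset when it moves to the new IP range, so firewall rules and interface endpoints can be translated mechanically. The Python sketch below shows that idea using the standard `ipaddress` module; the subnets, the address and the `remap_address` function are assumptions for illustration, not a description of how ASR itself assigns addresses.

```python
import ipaddress


def remap_address(address, source_net, target_net):
    """Map an address to the same host offset in the target subnet."""
    source = ipaddress.ip_network(source_net)
    target = ipaddress.ip_network(target_net)
    ip = ipaddress.ip_address(address)
    if ip not in source:
        raise ValueError(f"{address} is not in {source_net}")
    if source.prefixlen != target.prefixlen:
        raise ValueError("source and target subnets must be the same size")
    # Keep the host part of the address and swap the network part.
    offset = int(ip) - int(source.network_address)
    return str(target.network_address + offset)


# Hypothetical on-premise subnet and DR subnet of the same size.
new_ip = remap_address("192.168.10.25", "192.168.10.0/24", "10.50.10.0/24")
print(new_ip)  # 10.50.10.25
```

Azure reserves a handful of addresses at the start and end of each subnet, so servers with very low host offsets may need a different address in the target network.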
Typically the client is heavily engaged with this piece of work. This is actually beneficial, as it often serves as a configuration cleansing activity and normally involves implementing configuration best practice.
Application delivery is critical. When located in Azure, your applications have further to travel, so the choice is either to move your application delivery workloads from on-premise to Azure or to leverage new solutions such as WVD and FlexLogic. Note that this may be favorable, forming a useful introduction to Azure-based desktop delivery that could replace your on-premise solutions in production in future.
We'd recommend replicating your FSMO roles (that's not something you often hear). Without your FSMO roles in Azure, various services will be affected by the loss of the PDC Emulator (account activities and time synchronisation) and the Domain Naming Master (name changes). Without protection of the FSMO roles you'll need to perform a FSMO role seizure, which adds time and also complicates failback (two servers with the same role). We'd recommend re-replicating the protected FSMO role holders back to on-premise to keep things simple, but obviously be careful not to restart those DCs on-premise manually.
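A simple pre-flight check for the FSMO recommendation above might look like the following Python sketch. It assumes you have already recorded which domain controller holds each role and which hosts are protected by ASR; the host names and the `unprotected_fsmo_roles` function are hypothetical.

```python
# The five FSMO roles in an Active Directory forest and domain.
FSMO_ROLES = (
    "Schema Master",
    "Domain Naming Master",
    "RID Master",
    "PDC Emulator",
    "Infrastructure Master",
)


def unprotected_fsmo_roles(role_holders, protected_hosts):
    """Return the FSMO roles whose holder is not a protected host."""
    protected = {host.lower() for host in protected_hosts}
    return [
        role
        for role in FSMO_ROLES
        # A role with no recorded holder is also reported as unprotected.
        if role_holders.get(role, "").lower() not in protected
    ]


# Hypothetical inventory: dc01 is protected by ASR, dc02 is not.
role_holders = {
    "Schema Master": "dc01",
    "Domain Naming Master": "dc01",
    "RID Master": "dc02",
    "PDC Emulator": "dc02",
    "Infrastructure Master": "dc02",
}
missing = unprotected_fsmo_roles(role_holders, ["DC01"])
print("Roles that would need seizing after failover:", missing)
```

An empty result means every role holder would come across with the failover; anything listed would need either protecting or a planned role seizure.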
Less obvious benefits
An Azure IaaS migration is a long-term strategy for many businesses, and it can also be a very daunting change. ASR provides an excellent vehicle for becoming familiar with billing, processes, technical options and services. The ASR solution should also be seen as a site migration tool, as it can simply be used for moving workloads between your environments.
IT can use ASR to move workloads into a network they've created and then deliver services back to on-premise, essentially dipping their toes in the water of cloud services.
ASR can result in an incredible cost saving compared to a second data centre. ASR can also clone workloads into an isolated environment where they can be tested or reconfigured before any changes are applied to your production systems, so it can become a standard part of your change tool set.
In summary, ASR is an amazingly powerful tool in your arsenal, but it's only part of the solution. Make sure you look at the wider pieces of the jigsaw to ensure you gain maximum benefit for your disaster recovery solution.
A4S are the experts in disaster recovery and the use of ASR. Get in touch for help or guidance: https://a4scloud.solutions/contact