Reducing RPO and RTO With SolarWinds Backup

It’s time for organizations to take data recovery and data backup seriously. According to Boston Computing Network, 60% of companies that lose their data close within six months of the failure. Meanwhile, one in four companies never actually tests its disaster recovery plan, leaving its operations vulnerable to unexpected incidents.

The recovery time objective (RTO) and recovery point objective (RPO) are of paramount importance to organizations worried about data triage, and trying to better match recovery requirements with backup and DR investments. Understanding your customer’s RPO and RTO will help you build the system requirements and infrastructure to recover data as efficiently as possible. In the event of a disaster, the time it takes to recover and the amount of data lost will depend on your approach to regular backups and data storage capabilities. Here’s what you need to know about RTO and RPO to effectively manage a customer’s expectations—and to restore their data within the parameters their business requires.

What is the difference between RPO and RTO?

Essentially, RPO has to do with backup frequency, while RTO relates to the recovery timeline. When there is a system outage, the RPO and RTO are two data points that can tell you how seriously the downtime has impacted a customer’s business operations:

  • Recovery Point Objective (RPO) is the maximum amount of data, measured in time, that a business can afford to lose, and it determines how frequently you need to take backups. If a disaster occurs between backups, can you afford to lose five minutes’ worth of data updates? Or five hours? Or a full day? RPO represents how fresh recovered data will be. In practice, the RPO indicates the amount of data (updated or created) that will be lost or need to be reentered after an outage.
  • Recovery Time Objective (RTO) is the amount of downtime a business can tolerate. In a high-frequency transaction environment, seconds of being offline can represent thousands of dollars in lost revenue, while other systems (such as HR databases) can be down for hours without adversely impacting the business. The RTO answers the question, “How long can it take for our system to recover after we are notified of a business disruption?”

Put most simply, the RPO dictates how frequently you need to back up, and the RTO dictates how long you have to recover after disaster hits. Obviously, organizations hope to have short RTOs and RPOs, but there is also a balancing act needed to determine which systems and types of data are worth larger investments to achieve those short RTOs and RPOs. Not all data is equally critical to business operations.

The shorter your RTO, the less downtime the organization must endure. That minimizes productivity loss and recovery costs, and helps lower the chance that your organization’s reputation will take a hit. The shorter your RPO, the less data is at risk of being lost. While these two numbers are somewhat independent, they work together to help an organization develop the physical and virtual infrastructure for data recovery.
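To make the two measures concrete, here is a minimal Python sketch that compares what actually happened in one outage with the agreed objectives. The timestamps, the one-hour RPO, the two-hour RTO, and the function name `measure_outage` are all illustrative assumptions, not values from any real incident or product.

```python
from datetime import datetime, timedelta


def measure_outage(last_backup, failure, service_restored):
    """Return (data_loss_window, downtime) for a single outage."""
    # Everything written after the last good backup is lost: compare with the RPO.
    data_loss_window = failure - last_backup
    # Time from the failure until service is back: compare with the RTO.
    downtime = service_restored - failure
    return data_loss_window, downtime


# Hypothetical outage timeline.
last_backup = datetime(2024, 1, 15, 8, 0)
failure = datetime(2024, 1, 15, 11, 30)
restored = datetime(2024, 1, 15, 12, 15)

rpo = timedelta(hours=1)  # assumed objective
rto = timedelta(hours=2)  # assumed objective

loss, downtime = measure_outage(last_backup, failure, restored)
print(loss, loss <= rpo)          # 3:30:00 False
print(downtime, downtime <= rto)  # 0:45:00 True
```

In this made-up case the recovery was fast enough to meet the RTO, but the backups were too far apart to meet the RPO, which shows how the two numbers can succeed or fail independently.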

What is RTO and RPO in disaster recovery?

In disaster recovery, these numbers determine how long your organization experiences downtime, and how much data could be lost. In this important context, what is a “good” recovery point objective or recovery time objective? The answer is not straightforward. A good standard RPO/RTO depends on the type of disaster as well as the maximum tolerable period of disruption.

First, it’s important to define the potential set of disasters against which you would like to protect your organization. Some disasters that require data recovery and backup include:

  • Loss of data: This may be as simple as someone deleting a folder, or as complex as a case of ransomware or an infected database.
  • Loss of an application: This refers to when changes to security, an update, or system configurations negatively impact services.
  • Loss of a system: This includes when hardware fails, or, if you have a virtual server, when the operating system crashes.
  • Loss of business location: In this instance, a disaster might include an electrical outage, fire, flooding, or even a chemical spill outside the building. The business facilities require recovery to an alternate location.
  • Loss of operations: This is a complete stoppage of business operations—i.e., the worst-case scenario.

Each of these potential scenarios illustrates how important it is to consider your data, systems, applications, and physical location in your disaster-recovery strategy. These factors play a role in the RTO and RPO values. Once you’ve defined the particular disaster scenarios you’re hoping to protect against, you can prioritize the scenarios your customer is most interested in preventing, then implement data-protection features that match their RTO and RPO requirements.

A third figure factors into your RTO/RPO strategy: the maximum tolerable period of disruption (MTPD). This represents how long your customer is able to crisis-manage a system outage, and varies for every application and service you manage. Factors that play into this figure include tangible costs like employee wages, lost sales, weakened stock prices, and recovery expenses, as well as intangibles like reputational risk. It’s important to discuss the MTPD with your customer, and then apply that number to your RTO/RPO reduction strategy.

For example, for a given application, your customer’s maximum tolerable period of disruption might be two hours. That means your recovery time objective must be less than two hours, and your data must be backed up more often than once every two hours to meet the ideal RPO. This gives you the guideline you need to create a physical and virtual system that meets the needs of your customer in the event of a disaster.
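The two-hour example can be expressed as a simple planning check. The sketch below applies the rule of thumb from this article (both the RTO and the backup interval must be shorter than the MTPD); the function name `validate_against_mtpd` and the sample values are hypothetical.

```python
from datetime import timedelta


def validate_against_mtpd(mtpd, rto, backup_interval):
    """Return a list of problems; an empty list means the plan fits the MTPD."""
    problems = []
    if rto >= mtpd:
        problems.append("RTO must be shorter than the MTPD")
    if backup_interval >= mtpd:
        problems.append("backup interval must be shorter than the MTPD")
    return problems


mtpd = timedelta(hours=2)  # the customer's stated tolerance in the example

ok_plan = validate_against_mtpd(
    mtpd, rto=timedelta(minutes=90), backup_interval=timedelta(hours=1)
)
bad_plan = validate_against_mtpd(
    mtpd, rto=timedelta(hours=3), backup_interval=timedelta(hours=4)
)

print(ok_plan)   # []
print(bad_plan)  # both rules violated
```

A check like this is only a starting point: a customer may well want an RPO far tighter than the MTPD for data that changes quickly.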

If your customer isn’t sure what their maximum tolerable period of disruption is, there are a few key questions that can help them set better expectations. Ask these questions to understand a customer’s RTO and RPO on a more granular level.

  • How often does this type of data change?
  • What does each minute of downtime for this service cost, either in lost revenue or lost productivity?
  • Could you transact business with pencil and paper, if necessary, while this service is down?
  • If you are experiencing downtime, how does it impact your customers?

Going through these questions with your customer can help you work backward to what you need to back up and how this data needs to be backed up to minimize risk in a disaster scenario.
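The cost-per-minute question can be turned into a rough estimate that helps a customer see why a given RTO matters. This is a simplified sketch with made-up figures; it counts only lost revenue and idle staff wages, and ignores intangibles such as reputational damage.

```python
def downtime_cost(minutes_down, revenue_per_minute, staff_affected, hourly_wage):
    """Rough downtime cost: lost revenue plus wages paid to idle staff."""
    lost_revenue = minutes_down * revenue_per_minute
    lost_productivity = staff_affected * hourly_wage * (minutes_down / 60)
    return lost_revenue + lost_productivity


# Hypothetical service: 90 minutes down, $200 of revenue per minute,
# 25 employees unable to work at an average of $40 per hour.
cost = downtime_cost(
    minutes_down=90, revenue_per_minute=200.0, staff_affected=25, hourly_wage=40.0
)
print(cost)  # 19500.0
```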

What is RTO and RPO in SQL Server?

SQL Server is Microsoft’s relational database management system, which stores and retrieves data as requested by other applications. The server allows users to set up automated log backups that are restored on a standby server. With this log shipping, users can recover a fairly recent database copy, depending on the RTO and RPO of that process. Those RTO and RPO requirements are set by users, depending on their needs, budget, and any technological network limitations.

However, SQL Server RTO and RPO are not necessarily straightforward. In many cases, the process isn’t as fast as a client may imagine. They may have an ideal RPO in mind, but slow network speeds or an incorrectly configured backup can throttle this process. In addition, restoring a log backup in this way can involve transferring large amounts of data, and this process can easily exceed the determined acceptable RTO.
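A back-of-the-envelope calculation shows how data volume and network speed alone can push a restore past the RTO. The sketch below is not tied to SQL Server or any backup product; the backup size, link speed, fixed overhead, and four-hour RTO are assumptions chosen for illustration, and real restores also depend on disk speed, compression, and log replay time.

```python
def estimated_restore_minutes(backup_size_gb, network_mbps, overhead_minutes=0.0):
    """Estimate restore time from transfer size and sustained link speed."""
    # 1 GB is treated as 8,000 megabits (decimal units).
    transfer_seconds = backup_size_gb * 8000 / network_mbps
    return transfer_seconds / 60 + overhead_minutes


rto_minutes = 4 * 60  # assumed four-hour RTO

minutes = estimated_restore_minutes(
    backup_size_gb=500, network_mbps=100, overhead_minutes=20
)
print(round(minutes), minutes <= rto_minutes)  # 687 False
```

Here a 500 GB transfer over a sustained 100 Mbps link takes more than eleven hours, far beyond the assumed four-hour objective.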

Reducing an organization’s RTO and RPO

It’s important to consider the RTO and RPO as they apply to different aspects of an organization’s data. Organizations that do a file-level backup of a database, rather than investing in an offsite virtual environment, will see longer recovery times and limits to how recently updated that data will be once recovered.

Consider the possible disasters, match them with the data sets that need to be protected, and then identify the recovery objectives. These steps will then provide you the information necessary to build tactical backup solutions that meet your recovery time objective and recovery point objective.
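One lightweight way to record the outcome of these steps is a small table that pairs each disaster scenario with a data set and its objectives, then sorts by urgency. The sketch below is purely illustrative: the class name `RecoveryObjective`, the data sets, and every RPO and RTO value are invented examples, not recommendations.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RecoveryObjective:
    scenario: str
    data_set: str
    rpo: timedelta
    rto: timedelta


# Hypothetical plan for three data sets.
plan = [
    RecoveryObjective("Loss of data", "Shared file server",
                      timedelta(hours=4), timedelta(hours=8)),
    RecoveryObjective("Loss of a system", "Order database",
                      timedelta(minutes=15), timedelta(hours=1)),
    RecoveryObjective("Loss of business location", "HR database",
                      timedelta(hours=24), timedelta(hours=48)),
]

# Tightest RTO first, with RPO as the tie-breaker.
by_urgency = sorted(plan, key=lambda o: (o.rto, o.rpo))
for o in by_urgency:
    print(f"{o.data_set}: RPO {o.rpo}, RTO {o.rto} ({o.scenario})")
```

Sorting this way puts the systems that justify the largest backup investment at the top of the list.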

SolarWinds® Backup is designed to reduce RTOs and RPOs from hours to minutes. The system allows an organization to customize the data and assets they wish to protect and manage them all from a single dashboard. Organizations can recover physical and virtual servers, workstations, business documents, logs, and Office 365 data. For more serious threats—a downed server or a downed site, for instance—SolarWinds’ cloud-first backup solution can draw data from both local storage and the cloud to reduce the fallout.

Two features of SolarWinds Backup that stand out are the ability to do continuous restore, which creates a standby image, and the backup accelerator. Standby image gives organizations a more flexible and streamlined approach to recovery. This feature can give you an RTO of less than 5 minutes. Meanwhile, the backup accelerator cuts your RPO by continuously monitoring large files for changes, lowering backup preprocessing times, and reducing the overall time needed to complete backups.

 

Beat your RTO and RPO targets by learning more about SolarWinds Backup today.