From a very young age we are conditioned to the duality of things like pass/fail, stop/go, red/green, and on/off with little room in between. When it comes to monitoring backups, that pass/fail approach may seem ideal if you have just a single screenful of devices under management. However, once you’ve grown to hundreds, or even thousands, of devices the accuracy of a pass/fail result can be a bit ambiguous.
The reality is there are many more factors worth considering before you simply declare a backup job a success or flag it as a failure to be logged into a ticketing system. Assign too broad a definition of success and you risk overlooking devices that have real issues, while too narrow a definition risks burying critical issues in a sea of informational ones.
This is the first post in a multi-part blog series focused on how to define and implement effective monitoring of the gray areas between pass and fail with SolarWinds® Backup.
Seeing all your configured backup devices is the default dashboard view when you initially launch SolarWinds Backup. You can click the column headers to sort by “Customer,” “Backup status,” “Errors,” etc., and you can also change the number and density of devices viewed on the screen. However, when you get to more than a single screen of devices, monitoring your success at a glance becomes problematic. One option is to click the “Export” button and review the device statistics offline in Microsoft Excel, but you really need to start to consider how you will identify and manage devices by exception.
Expanding dashboards from the left vertical menu presents a set of predefined dashboard views. To display only unsuccessful backups, select the second dashboard view entry, “Unsuccessful backups,” which filters out systems that last reported as “No backup,” “In process,” “Completed,” or “Completed with errors.” Each of these predefined views adds specific columns and filters to the dashboard to display a distinct set of statistics that identify unsuccessful or outdated backups, cloud or physical assets, devices leveraging disaster recovery, or other advanced features.
Unsuccessful backups and completed with errors
Since there is no predefined view that includes both “Unsuccessful backups” and “Completed with errors,” we will simply edit the filter set by the “Unsuccessful backups” view discussed earlier. Look below the toolbar for the “Select from Backup status” filter, click the drop-down arrow, and check “Completed with errors.” The “Unsuccessful backups” view includes multiple status filters that you would typically consider unsuccessful, such as “Failed,” “Aborted,” “Interrupted,” “Not started,” etc. It does not include “Completed with errors,” since the actual number or type of errors might easily shift such a job back and forth between what would pass for a “Completed” or a “Failed” status.
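If you prefer to reason about this filter outside the console, for example against an exported device list, the same logic can be sketched in a few lines of Python. This is only an illustration: the status names come from this post, but the device records, the field names "device" and "status," and the helper needs_review are hypothetical and do not reflect the product's actual export format or API.

```python
# Statuses this post names as part of the "Unsuccessful backups" view.
# The post's list ends in "etc.", so this set is intentionally incomplete.
UNSUCCESSFUL_STATUSES = {"Failed", "Aborted", "Interrupted", "Not started"}


def needs_review(status, include_completed_with_errors=True):
    """Return True if a device's last backup status should be flagged."""
    if status in UNSUCCESSFUL_STATUSES:
        return True
    # "Completed with errors" is the gray area: opt in or out explicitly.
    return include_completed_with_errors and status == "Completed with errors"


# Hypothetical sample records; real data would come from your own export.
devices = [
    {"device": "srv-01", "status": "Failed"},
    {"device": "srv-02", "status": "Completed"},
    {"device": "wks-07", "status": "Completed with errors"},
    {"device": "wks-09", "status": "In process"},
]

flagged = [d["device"] for d in devices if needs_review(d["status"])]
print(flagged)
```

Keeping “Completed with errors” behind an explicit switch mirrors the choice described above: it is a deliberate addition to the filter, not part of the default definition of unsuccessful.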
Unsuccessful backups more than 48 hours ago on server operating systems
Assuming server protection is more important to you than workstations, you can quickly view systems that have been unsuccessful for more than 48 hours by using the three interactive doughnut widgets located at the top of the console. Viewed from left to right, they provide visibility into your managed assets, last protection status, and current compliance status. Start with the “All devices” view and then just click the widget colors to filter on “Servers,” “Unsuccessful,” and “More than 48 hours ago.” This gives you immediate access to a condensed list of systems that are likely worthy of further review.
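The three widget selections combine as a logical AND. A minimal Python sketch of that combination follows, assuming that “More than 48 hours ago” refers to the time since the last successful backup. The field names ("os_type," "status," "last_success"), the sample devices, and the function stale_unsuccessful_servers are all hypothetical and are not taken from the product.

```python
from datetime import datetime, timedelta

# Statuses named in this post; its list ends in "etc.", so this is partial.
UNSUCCESSFUL_STATUSES = {"Failed", "Aborted", "Interrupted", "Not started"}


def stale_unsuccessful_servers(devices, now, max_age=timedelta(hours=48)):
    """Return names of servers that are unsuccessful AND older than max_age."""
    return [
        d["device"]
        for d in devices
        if d["os_type"] == "Server"
        and d["status"] in UNSUCCESSFUL_STATUSES
        and now - d["last_success"] > max_age
    ]


# A fixed "now" keeps the example reproducible.
now = datetime(2020, 3, 10, 8, 0)

# Hypothetical sample records.
devices = [
    {"device": "srv-01", "os_type": "Server", "status": "Failed",
     "last_success": datetime(2020, 3, 7, 2, 0)},    # 78 hours ago
    {"device": "srv-02", "os_type": "Server", "status": "Failed",
     "last_success": datetime(2020, 3, 9, 2, 0)},    # 30 hours ago
    {"device": "srv-03", "os_type": "Server", "status": "Completed",
     "last_success": datetime(2020, 3, 10, 2, 0)},   # 6 hours ago
    {"device": "wks-07", "os_type": "Workstation", "status": "Aborted",
     "last_success": datetime(2020, 3, 1, 2, 0)},    # not a server
]

result = stale_unsuccessful_servers(devices, now)
print(result)
```

Only a device that meets all three conditions survives the filter, which is why the resulting list is so much shorter than the full dashboard.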
Unsuccessful backups more than 48 hours ago on server operating systems with Microsoft SQL
Maybe this is still too many devices to interact with or assign to a technician to resolve. If this is the case, one approach is to use additional filters to further divide the devices by “Active data sources,” for example, systems with Microsoft SQL configured. This can be done easily by clicking the “Select filter” drop-down, checking “Active data sources,” and then selecting “MS SQL VSS” or other data sources. There are well over 300 filters and columns available to choose from when drilling deeper into device statistics.
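Dividing a device list by data source is a simple grouping operation, and a short Python sketch can make the idea concrete. Apart from “MS SQL VSS,” which is named above, everything here is an assumption for illustration: the field names, the sample devices, the other data source labels, and the two helper functions are hypothetical.

```python
from collections import defaultdict


def with_data_source(devices, source):
    """Return names of devices that have the given data source active."""
    return [d["device"] for d in devices if source in d["data_sources"]]


def group_by_data_source(devices):
    """Map each data source to the devices using it (a device may repeat)."""
    groups = defaultdict(list)
    for d in devices:
        for source in d["data_sources"]:
            groups[source].append(d["device"])
    return dict(groups)


# Hypothetical sample records; labels other than "MS SQL VSS" are placeholders.
devices = [
    {"device": "srv-01", "data_sources": ["Files and folders", "MS SQL VSS"]},
    {"device": "srv-02", "data_sources": ["Files and folders", "System state"]},
    {"device": "srv-03", "data_sources": ["MS SQL VSS"]},
]

sql_devices = with_data_source(devices, "MS SQL VSS")
groups = group_by_data_source(devices)
print(sql_devices)
print(groups)
```

Grouping this way gives you smaller, skill-specific lists, such as handing every SQL-related device to the technician most comfortable with databases.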
Customizing and saving views
Building even more elaborate views and filters is possible, and they can be saved for later use. As an example, we can build upon the previous view and use the steps above to customize it to include devices that “Completed with errors.” Then you can use the “Columns” button to replace “OS type” with “OS version.” Next, scroll your columns to the right and drag “OS version” to the left of the “Data sources” column. Now click “Save view” and name this view before clicking “Save as new.” Finally, go ahead and check: your new view should be visible in the drop-down lists, on the left side “Dashboard views” and right side “Save views.”
Daily notification of unsuccessful backups
If you want to be notified periodically with a list of devices that meet your saved view criteria for an “Unsuccessful backup,” then emailed views are the way to go. Start by clicking the “Save view” drop-down and selecting “Email view.” Next, click “Add schedule,” select or add email recipients, select the appropriate dashboard view, set a schedule, and then click “Save,” or click “Send it now” to get a live sample in your inbox.
All the search filters discussed in this post use the “Normal search” filter mode. These interactive filters are displayed just below the doughnut widgets and give you a tremendous amount of control over your dashboard views. However, this is just the tip of the iceberg when it comes to managing backup by exception, because filters also include an “Advanced search” query language that lets you build even more specific filters based on statistics like date, time, duration, size, error count, etc. This makes it possible to look at things like performance, usage, deduplication ratios, and underprotected devices.
Defining and constructing these advanced filters will be the focus of several of the next blog posts.
Eric Harless is the head backup nerd at SolarWinds MSP. Eric has worked with SolarWinds Backup since 2013 and has over 25 years of data protection industry experience in sales, support, marketing, systems engineering, and product management.
You can follow Eric on Twitter at @backup_nerd