How, and Why, We Applied Machine Learning to Cove Continuity, Part 2

If you haven’t already read part 1, please do that first.
If you look at the screenshots, they’re actually quite simple to understand. Anyone can tell at a glance whether the OS booted successfully. Look at the following examples and you’ll see what I mean:
So, rather than relying on the existing deterministic method, which used indirect evidence, we opted to use machine learning and neural networks to analyze and classify screenshots the way a human being would.
Only a few years ago, this might have sounded like science fiction. Today, there are a variety of mature AI/ML tools and technologies ready for practical application. Better still, many of them are open-source and publicly available. So, our job was to look for proper tooling and use it accordingly.
We knew what to do, but we had little experience in the area, so we decided to start with a proof of concept (POC). We evaluated a number of neural networks that can be used to classify images and opted for SqueezeNet, a well-known open-source model that has proved its efficiency in numerous contests.
Preparing the dataset and training the model
Roughly speaking, the SqueezeNet model is an architecture, or algorithm, that can solve a general problem: in this case, classifying images. With some training, however, we can make it solve our specific problem: classifying VM screenshots.
To train the model, we first must provide it with a proper dataset. This allows the model to “know” what we are trying to classify and what we expect as a result. Since Recovery Testing has been in production for several years, we had tons of VM screenshots to use for training. To prepare the dataset, we removed all PII and manually labeled 2,500 screenshots with a specific class, e.g., failed to boot, successfully booted, or loading.
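The labeling step can be pictured as a manifest that pairs each screenshot with one class name. What follows is a minimal sketch, not the team's actual tooling: the class identifiers, the file names, and the `encode_labels` and `class_counts` helpers are all hypothetical, chosen only to mirror the three classes mentioned above.

```python
from collections import Counter

# Hypothetical identifiers for the three classes named in the text.
CLASSES = ("failed_to_boot", "booted", "loading")
CLASS_TO_INDEX = {name: index for index, name in enumerate(CLASSES)}


def encode_labels(manifest):
    """Turn (filename, class name) pairs into (filename, class index) pairs.

    A classifier is trained on integer class indices, so every manual label
    has to map to exactly one known class.
    """
    encoded = []
    for filename, label in manifest:
        if label not in CLASS_TO_INDEX:
            raise ValueError(f"{filename}: unknown label {label!r}")
        encoded.append((filename, CLASS_TO_INDEX[label]))
    return encoded


def class_counts(manifest):
    """Count screenshots per class to spot a badly unbalanced dataset."""
    return Counter(label for _, label in manifest)


if __name__ == "__main__":
    # Made-up file names standing in for anonymized VM screenshots.
    manifest = [
        ("vm-0001.png", "booted"),
        ("vm-0002.png", "failed_to_boot"),
        ("vm-0003.png", "loading"),
        ("vm-0004.png", "booted"),
    ]
    print(encode_labels(manifest))
    print(class_counts(manifest))
```

Checking the per-class counts before training is a cheap way to notice that one class, for example failed boots, is badly underrepresented.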
We used PyTorch to conduct the training, which took around 60 minutes on the above dataset. As a result, we got a 5 MB file with parameters: compact enough that it will not eat a lot of RAM. Nice!
The output of training is a set of model weights, or parameters. If we load these parameters into the model, it should be able to classify any set of screenshots, even if they were not part of the initial training set.
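A quick back-of-the-envelope calculation shows why a file of about 5 MB is plausible. The sketch below assumes weights stored as 32-bit floats and uses roughly 1.25 million parameters, the figure usually cited for the original SqueezeNet; the exact parameter count of our trained model is not stated here, so treat the number as illustrative. The `weights_size_mb` helper is hypothetical.

```python
BYTES_PER_FLOAT32 = 4


def weights_size_mb(parameter_count, bytes_per_parameter=BYTES_PER_FLOAT32):
    """Approximate size of a weights file in megabytes (10**6 bytes)."""
    return parameter_count * bytes_per_parameter / 1_000_000


if __name__ == "__main__":
    # Assumed parameter count; see the note above.
    print(weights_size_mb(1_250_000))  # 5.0
```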
Evaluating results and tuning parameters
Theoretically, we could have applied that model in production, but before doing that we had to validate the results and, if needed, tune the model parameters. In the first stage, we checked the trained model on 1,000 screenshots that were not part of the training set. In the second stage, we checked it on another 36,000 screenshots.
The results were tremendous. Classifying a screenshot takes at most 1 second on a decent developer workstation with an Intel i7 CPU. More importantly, we now have a mechanism that classifies screenshots with an accuracy of 99%. Not bad!
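Accuracy here simply means the share of held-out screenshots whose predicted class matches the manual label. The sketch below shows that calculation together with a count of which classes get confused with which. The `accuracy` and `confusion_counts` helpers and the sample labels are hypothetical and purely illustrative. As a sense of scale, if 99% applies across all of the roughly 37,000 verification screenshots, about 370 of them would have been misclassified.

```python
from collections import Counter


def accuracy(expected, predicted):
    """Fraction of screenshots whose predicted class matches the manual label."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(e == p for e, p in zip(expected, predicted))
    return correct / len(expected)


def confusion_counts(expected, predicted):
    """Count (expected, predicted) pairs to see which classes get mixed up."""
    return Counter(zip(expected, predicted))


if __name__ == "__main__":
    # Made-up labels for four held-out screenshots.
    expected = ["booted", "booted", "failed_to_boot", "loading"]
    predicted = ["booted", "failed_to_boot", "failed_to_boot", "loading"]
    print(accuracy(expected, predicted))  # 0.75
    print(confusion_counts(expected, predicted))
```

The confusion counts matter in practice because the two kinds of error are not equally costly: reporting a failed boot as successful is worse than the reverse.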
A summary in numbers:
- 2.5k screenshots used for training
- 37k screenshots used for verification
- 5MB size of the file containing model parameters
- 1 second max time needed to classify a single screenshot
- 99% accuracy of the classification
Sergey Shaminko is Cove Engineering Manager at N‑able
To learn more about how Cove keeps your customers’ data safe, don’t hesitate to schedule a call with us!
To find out more about Cove Data Protection, visit www.n-able.com/products/cove-data-protection, or simply start a free trial at www.n-able.com/products/cove-data-protection/trial.