What are your options when a patch goes wrong?

We all know patching is important, right? Patches fix bugs, close security holes and add new features. Keeping systems secure should be your number one concern, but you have to balance security against stability, which can be difficult to pull off. Generally speaking, patches install without any issues because vendors do a lot of work in the background to ensure their patches pass rigorous quality control tests.

Sometimes, however, things can go horribly wrong.

Despite vendors’ best efforts, patches can cause problems

There have been a number of problematic patches over the past few years. Back in 2014, several patches caused Windows 8.1 machines to reboot to the “Blue Screen of Death”. In late 2015, an update left Windows Server 2012 R2 unable to boot, stopping with an INACCESSIBLE_BOOT_DEVICE error. This can be a nightmare for IT solution providers (whether you’re an IT admin or a Managed Service Provider, or MSP), especially those who have set up a standard patch schedule across all of their machines. More recently, the Windows 10 Anniversary Update caused thousands of webcams to freeze and stop working. A problem like this is usually less urgent and the fix is relatively straightforward, but that’s not always the case.

What to do when a patch goes wrong?

When a patch goes wrong, the result can be as minor as a feature that stops working correctly. Sometimes, however, a server will fail to boot, causing massive disruption to an organisation. If you’re a small business, you’re unlikely to have a full business continuity system in place. Even if you have backups, you won’t want to use them until you have done some thorough troubleshooting, since restoring a system from backup can be more work than resolving the underlying issue.

Your response to an issue after installing a patch will vary depending on its severity. If a server has installed a patch, rebooted and won’t start up correctly, it’s important not to panic! At first you may not realise that a patch has caused the issue. This is where experience comes into play. Here are a few tips to help in these situations:

  • Don’t panic
    Panicking gets you nowhere. Managing patches that go horribly wrong should not be new to you.
  • Start keeping good notes
    A good administrator/technician always keeps comprehensive notes – even under pressure. The more information you keep, the easier it will be to manage the situation and justify your time and actions to others later.
  • Assess the situation
    Are multiple devices experiencing the issue? Are the issues affecting lots of users or is it just a small handful of desktops?
  • Prioritise issues to be worked on
    For MSPs, not all clients pay for the same level of service, so prioritise: concentrate on your premium clients first and work your way down.
  • Communicate
    This is a vital step. When employees are sitting at their desks unable to work due to a server failure, they will wonder what’s going on and assume nothing is happening. Assign someone to keep users updated and let them know engineers are on the case.
  • Initiate business continuity plans
    Some companies will have backup servers ready to be spun up at a moment’s notice. If this is the case and the issue has not been resolved after basic troubleshooting, then action these plans.
  • Remote access
    If you’ve set up iLO or DRAC access to your servers, you can begin remote troubleshooting straight away, either working on a single issue or spreading multiple cases across your team. This is where implementing a standard server build helps. Having remote access to devices even when the operating system is not working will save you time and money in the long run.
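
The assessment and prioritisation tips above can be turned into a simple triage queue. The following is only a rough sketch: the tier names, the `Incident` fields and the ordering rule (client tier first, then servers before desktops, then number of users affected) are assumptions made for illustration, not part of any particular ticketing tool.

```python
from dataclasses import dataclass

# Hypothetical service tiers; a lower rank is worked on first.
TIER_RANK = {"premium": 0, "standard": 1, "basic": 2}


@dataclass
class Incident:
    client: str
    tier: str            # must be a key of TIER_RANK
    device: str
    is_server: bool
    users_affected: int


def prioritise(incidents):
    """Order incidents by client tier, then servers first, then users affected."""
    return sorted(
        incidents,
        key=lambda i: (TIER_RANK[i.tier], not i.is_server, -i.users_affected),
    )


if __name__ == "__main__":
    queue = prioritise([
        Incident("Client A", "basic", "srv-01", True, 40),
        Incident("Client B", "premium", "pc-07", False, 1),
        Incident("Client C", "premium", "srv-02", True, 25),
    ])
    for incident in queue:
        print(incident.client, incident.device, incident.users_affected)
```

In practice you would adjust the sort key to match your own service agreements, but writing the rule down in advance means nobody has to invent it in the middle of an outage.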

Fall back options

When a patch goes horribly wrong, you have to act quickly, so your first troubleshooting steps will usually involve one or more of the following:

  • Remove the updates
    One of your first troubleshooting steps will usually be to remove the patches either manually or via your management tools.
  • System restore
    When a large update is installed, a system restore point is usually created (provided System Restore is enabled on the machine), giving you the option to manually restore the system if something goes wrong.
  • Backup
    If all else fails, then it’s time to fire up your backup tools and restore the affected files or the complete operating system.
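
As an illustration of removing an update manually, the sketch below builds a command line for `wusa.exe`, the Windows Update Standalone Installer that ships with Windows, using its `/uninstall`, `/kb:`, `/quiet` and `/norestart` switches. The helper names and the KB number are placeholders for this example. Not every update can be removed this way and behaviour varies between Windows versions, so treat it as a starting point and test it before relying on it.

```python
import subprocess


def build_uninstall_command(kb_number):
    """Build a wusa.exe command line that removes one update by KB number."""
    digits = kb_number.strip().upper()
    if digits.startswith("KB"):
        digits = digits[2:]
    if not digits.isdigit():
        raise ValueError("not a KB number: %r" % kb_number)
    return ["wusa.exe", "/uninstall", "/kb:" + digits, "/quiet", "/norestart"]


def uninstall_update(kb_number, dry_run=True):
    """Print the command (dry run) or run it and return wusa's exit code."""
    command = build_uninstall_command(kb_number)
    if dry_run:
        print("Would run:", " ".join(command))
        return None
    # The exit code is returned rather than checked, because wusa may
    # report a non-zero code even on success (e.g. when a reboot is pending).
    return subprocess.run(command).returncode


if __name__ == "__main__":
    # "KB0000000" is a placeholder, not a real update.
    uninstall_update("KB0000000")
```

The dry run is the default on purpose: when you are under pressure, it is worth seeing exactly what will be executed on a server before you commit to it.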

Your patching options

So what are your patching options to try and reduce the chances of a patch blowing up? Everyone has their own opinions and experiences, but generally a business will adopt one or a combination of the following policies:

  • Never patch
    Never patching is not really an option unless you have a reckless disregard for security, or the system will never be connected to the internet and will never have new software installed.
  • Manually install at a convenient time
    Manually installing patches when it’s quiet and you can afford some downtime can be done, but it is only viable if you manage a very small handful of devices and you’re the onsite admin for the network. Even then, you leave yourself open to security issues, and it can take up too much of your time. Many very small, self-managed businesses adopt the manual patching method, with the patching usually done by the business owner or one of the employees, when they remember to do it.
  • Automatically install critical patches and manually install optional patches
    Another patching method is to install critical patches every evening or once a week and install optional ones either manually or every other week. This strikes a good balance between stability and keeping your system up to date.
  • Testing in a virtual environment
    Applying patches in a test environment before rolling them out to your devices is possible. However, unless you are restoring a backup from the previous evening onto exactly the same hardware, you can’t guarantee that what happens in a virtual environment will happen in production. Testing can take up a lot of time, and unless you are in a large organisation you may not feel you have the resources to do it. For an MSP, however, installing in a test environment is exactly what is required to identify harmful patches before deploying them to a wider audience.
  • Automated patching of all available updates weekly
    One valid option is to delay your patching by a few days. Microsoft releases new patches at set times: this used to be the second Tuesday of every month, but is due to change for a range of platforms in October 2016. Delaying means you can play it safe, let others test the patches first, and wait for the fallout in the forums if something goes wrong. It also gives Microsoft time to revoke patches or release further fixes to resolve known issues. Automating the installation of patches on a Friday night gives you the weekend to check whether things have gone wrong. As long as you have server monitoring set up, you will be notified of any major outages caused by a bad patch.
  • Use your tools to automate the process
    Having the right tools in place helps a lot and can save you time. Centralised patch management is vital for managing any number of workstations and servers: you can deploy patches easily on a schedule and remove them again if needed. Automation is key to successful patch management, and tools such as Windows Server Update Services (WSUS) or Patch Management are great choices.
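
To make the delayed, Friday-night approach concrete, here is a small sketch that works out an install date for a given month. It assumes the second-Tuesday release schedule described above, and the function names and the three-day minimum delay are choices made for this example rather than settings from any patch management product.

```python
import calendar
from datetime import date, timedelta


def patch_tuesday(year, month):
    """Return the second Tuesday of the given month."""
    first = date(year, month, 1)
    # Days from the 1st of the month to the first Tuesday (Monday is 0).
    offset = (calendar.TUESDAY - first.weekday()) % 7
    return first + timedelta(days=offset + 7)


def friday_install_date(year, month, min_delay_days=3):
    """Return the first Friday at least min_delay_days after Patch Tuesday."""
    earliest = patch_tuesday(year, month) + timedelta(days=min_delay_days)
    offset = (calendar.FRIDAY - earliest.weekday()) % 7
    return earliest + timedelta(days=offset)


if __name__ == "__main__":
    print(patch_tuesday(2016, 10))        # 2016-10-11
    print(friday_install_date(2016, 10))  # 2016-10-14
```

Raising `min_delay_days` pushes the install to a later Friday, which gives more time for problems to surface in the forums at the cost of leaving systems unpatched for longer.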

Summary

Installing updates and patches is vital to ensure that your networks are protected against security threats and as stable as possible, and that new features are available to users. Issues will always crop up, and it’s how you prepare for and respond to them that shows how professional you are.

Define your procedures and ensure everyone knows them… and remember that, during a crisis, communication and keeping calm are vital.
