
ILTA Just-In-Time: Preparing for Cloud Outages

By Tara Saylor posted an hour ago

  

Please enjoy this blog post co-authored by Tara Saylor, Intranet and Web Tools Manager, Bryan Cave Leighton Paisner LLP, Corey Thomas, Chief Technology Officer, Lightfoot, Franklin & White, L.L.C., and Thomas Witherspoon, Senior Systems Support Engineer, Sidley Austin LLP.

Outages happen
 
Even in the age of cloud-based solutions, outages are an unpleasant part of life in IT. While cloud architecture has improved stability and uptime, the inevitable outages seem to reach further when they do occur.

While cloud outages aren’t something a team can repair directly, there are steps that can be taken to prepare before an outage happens.  

Risk Assessment
 
The first step in managing the risk of a cloud outage is to understand and document your portfolio. Start by identifying the cloud platforms and services behind key applications. Are they hosted in AWS? Do they use Microsoft for SSO?
 
Ideally, this would all be documented in a Configuration Management Database (CMDB) or portfolio management tool. However, a spreadsheet of critical dependencies is a solid start. List applications, dependencies, and integrations so that it is clear which services will be impacted when an outage strikes.
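As a rough illustration of what even a simple dependency list makes possible, the sketch below answers the question "if this service goes down, which applications are affected?" The application names, the `INVENTORY` structure, and the `impacted_applications` helper are hypothetical placeholders, not a recommended catalog or a real tool's API.

```python
# Hypothetical inventory: application -> services it depends on.
# In practice this might be loaded from a CMDB export or a spreadsheet.
INVENTORY = {
    "Document Management": ["AWS", "Microsoft SSO"],
    "Time Entry": ["Microsoft SSO"],
    "Intranet": ["Azure"],
    "Email": ["Microsoft 365", "Microsoft SSO"],
}


def impacted_applications(inventory, failed_service):
    """Return the sorted names of applications that depend on failed_service."""
    return sorted(
        app for app, dependencies in inventory.items()
        if failed_service in dependencies
    )


if __name__ == "__main__":
    for app in impacted_applications(INVENTORY, "Microsoft SSO"):
        print(app)
```

The same lookup works whether the list lives in a CMDB or a spreadsheet; what matters is that dependencies are recorded per application before the outage starts.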

Establish Backups and Redundancy 
 
As a follow-up to assessing the risk of a cloud outage, look for areas where backups, redundancy, or diversity can be established. Can the cloud vendor place resources in multiple regions or datacenters? Additionally, working towards a state where key services do not all sit on the same backbone provider helps mitigate the impact of the inevitable outage. Pay special attention to break-glass credentials and emergency notification tools during this step.
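One way to spot where key services cluster on a single provider is to count them. The sketch below is a minimal example under assumed data: the `HOSTING` map, the service names, and the 50% threshold are all hypothetical and should be replaced with a firm's own inventory and risk tolerance.

```python
from collections import Counter

# Hypothetical map of key service -> provider it ultimately runs on.
HOSTING = {
    "Document Management": "AWS",
    "Time Entry": "AWS",
    "Emergency SMS": "AWS",
    "Intranet": "Azure",
}


def provider_concentration(hosting):
    """Return each provider's share of the listed services (0.0 to 1.0)."""
    counts = Counter(hosting.values())
    total = len(hosting)
    return {provider: count / total for provider, count in counts.items()}


def concentrated_providers(hosting, threshold=0.5):
    """Return providers whose share of services exceeds the threshold."""
    shares = provider_concentration(hosting)
    return sorted(p for p, share in shares.items() if share > threshold)


if __name__ == "__main__":
    print(concentrated_providers(HOSTING))
```

In this made-up inventory the emergency SMS tool sits on the same provider as most other services, which is exactly the kind of overlap worth reviewing.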

Set up Monitoring and Alerting
 
Is email acting up for one attorney, or is it down for the entire firm? Is there one person with issues, or is half the city experiencing the same problem? Well-planned alerts help determine the scope of the issues. 
 
Unfortunately, there’s no one tool that does it all. Cloud service providers frequently offer status pages and email alerts for their individual services, and there are many crowdsourced tools. A list of free or low-cost tools appears at the end of this article as a place to start.
 
Alerts should be checked periodically to confirm they’re providing accurate information.  
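The scoping questions above (one attorney or the whole firm?) can be sketched as a small classifier over incoming issue reports. This is only an illustration: the `(user, office)` report format, the category labels, and the `classify_scope` function are assumptions, and real alerting tools will expose their own data shapes.

```python
def classify_scope(reports):
    """Classify issue reports for one service by how widely they are spread.

    reports is a list of (user, office) tuples, a hypothetical format.
    """
    users = {user for user, _ in reports}
    offices = {office for _, office in reports}
    if not users:
        return "no reports"
    if len(users) == 1:
        return "one person"
    if len(offices) == 1:
        return "one office"
    return "multiple offices"


if __name__ == "__main__":
    sample = [("asmith", "Chicago"), ("bjones", "Chicago"), ("clee", "Atlanta")]
    print(classify_scope(sample))
```

Duplicate reports from the same person collapse into one user, so a single frustrated attorney filing three tickets still reads as "one person".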

Plan Communications 
 
The middle of an outage is not the time to start wondering if an email should be sent. In advance, plan a strategy for tailored, timely notifications that fits the firm’s culture and communication strategies.
 
Consider when notifications will be helpful and when they’re likely to introduce more confusion. Draft sample text or build templates so that plugging in details is all that’s needed. Remember that different problems may need different approaches. Updates about a localized ISP issue may just need to go to one office, while a vendor outage may need to go to everyone.
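Pre-drafted templates can be as simple as text with named blanks. The sketch below uses Python's standard `string.Template`; the template wording, the `NOTICES` structure, the audience labels, and the `build_notice` helper are hypothetical examples rather than suggested firm language.

```python
from string import Template

# Hypothetical templates: each problem type has its own audience and wording.
NOTICES = {
    "local_isp": {
        "audience": "affected office only",
        "body": Template(
            "Internet service in the $office office is disrupted. "
            "Next update by $next_update."
        ),
    },
    "vendor_outage": {
        "audience": "entire firm",
        "body": Template(
            "$vendor is reporting an outage affecting $service. "
            "Next update by $next_update."
        ),
    },
}


def build_notice(kind, **details):
    """Return (audience, message text) for the given kind of problem."""
    notice = NOTICES[kind]
    # substitute() raises KeyError if a placeholder is left unfilled,
    # so an incomplete message is caught before it is sent.
    return notice["audience"], notice["body"].substitute(**details)


if __name__ == "__main__":
    audience, text = build_notice(
        "local_isp", office="Chicago", next_update="2:00 PM"
    )
    print(audience)
    print(text)
```

Using `substitute` rather than `safe_substitute` is a deliberate choice here: a missing detail fails loudly instead of sending a message with a raw `$placeholder` in it.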
 
Don’t forget to think about how these messages will be delivered. Out-of-band options such as SMS alerting can raise awareness when primary communication channels are incapacitated by an outage. Like monitoring tools, these channels should be regularly tested and validated.
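To make the out-of-band idea concrete, the sketch below picks the delivery channels that do not rely on whatever is currently down. The channel list, the dependency names, and the `usable_channels` function are hypothetical; the point is only that each channel's own dependencies need to be written down ahead of time.

```python
# Hypothetical delivery channels and the services each one relies on.
CHANNELS = [
    {"name": "email", "depends_on": {"Microsoft 365"}},
    {"name": "chat", "depends_on": {"Microsoft 365"}},
    {"name": "sms", "depends_on": {"SMS gateway"}},
]


def usable_channels(channels, failed_services):
    """Return names of channels with no dependency on a failed service."""
    failed = set(failed_services)
    return [
        channel["name"]
        for channel in channels
        if not (channel["depends_on"] & failed)
    ]


if __name__ == "__main__":
    print(usable_channels(CHANNELS, ["Microsoft 365"]))
```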

Build an Incident Response Plan
 
The steps outlined above build the backbone of your Incident Response Plan. Tabletop exercises should be planned, executed, and documented so that when an outage occurs, team members are executing from muscle memory instead of scrambling to find the runbook. Don’t forget to include the steps to confirm systems are back online once the outage ends. Make sure you know where the runbooks are!

Prioritize Business Continuity
 
Business continuity planning in a cloud-dependent environment focuses less on eliminating outages and more on preparing for and recovering from them. This includes evaluating vendor resilience (e.g., regional redundancy), understanding dependencies across the technology stack, and ensuring systems are architected to minimize widespread disruption. 
 
Firms must weigh cost versus risk when considering redundancy strategies like multi-cloud or regional failover, recognizing that downtime is inevitable. Continuity also depends on having alternative communication channels, tested recovery processes (such as disaster recovery exercises), and documented system dependencies to quickly assess impact.
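The cost-versus-risk comparison can be roughed out with simple arithmetic. The sketch below is a back-of-the-envelope model, not a financial method: the function names are hypothetical and every figure in the usage example (outage hours, hourly cost, reduction fraction, redundancy cost) is invented for illustration.

```python
def expected_annual_downtime_cost(outage_hours_per_year, cost_per_hour):
    """Estimated yearly cost of downtime with no extra redundancy."""
    return outage_hours_per_year * cost_per_hour


def redundancy_pays_off(outage_hours_per_year, cost_per_hour,
                        reduction_fraction, redundancy_cost_per_year):
    """Return True if the downtime cost avoided exceeds the redundancy cost.

    reduction_fraction is the assumed share of downtime that the
    redundancy strategy would eliminate (0.0 to 1.0).
    """
    baseline = expected_annual_downtime_cost(outage_hours_per_year, cost_per_hour)
    avoided = baseline * reduction_fraction
    return avoided > redundancy_cost_per_year


if __name__ == "__main__":
    # Invented figures: 8 outage hours a year at 10,000 per hour,
    # regional failover assumed to remove 75% of that for 50,000 a year.
    print(expected_annual_downtime_cost(8, 10_000))
    print(redundancy_pays_off(8, 10_000, 0.75, 50_000))
```

A model this simple ignores reputational and client-service costs, so it is best treated as a starting point for the conversation rather than the answer.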
 
Ultimately, organizations are responsible for maintaining operations, so continuity planning must emphasize preparedness, recovery speed, and informed vendor management rather than relying on any single provider to prevent outages.

Crowdsourced Monitoring Resources

        Downdetector.com

        Reddit.com

                r/outages

                r/Azure

                r/cloudcomputing

                r/aws

                r/sysadmin

                r/devops

                r/googlecloud

        Cisco ThousandEyes (Subscription Service)

        Azure Status Page

        ILTA Desktop & Application Services Community

        Bleeping Computer


Other Content for Further Study

        Building and Managing Your Firm’s Application Portfolio (ILTA, 2025)

        Generative AI for Ourselves (ILTA, 2025)

        Model Context Protocol Within Legal (ILTA, 2025)

        Tip of the Week: When the Cloud Falls (ILTA, 2025)

        Navigating Change Management: Implementing Change Effectively (ILTA, 2025)

        Surviving Data Disasters: How to Prepare and recover from Data-Loss Events (ILTA, 2024)

        Mitigating Risk with Cloud Technologies (ILTA, 2023)

        Continuously Monitoring Controls in a Cloud Environment (ILTA, 2023)

        Enterprise business continuity management customer and cloud partner responsibilities (Microsoft, 2025)




