Why Passing Tests Doesn’t Reduce Surprise
Organizations are testing more than ever. Most firms can now point to mapped business services, defined recovery objectives, structured exercises and regular reporting to boards. From a governance perspective, that represents meaningful progress.
And yet disruptive incidents continue to unfold in ways that feel out of proportion to what was rehearsed.
That is not because exercises are poorly run. In many organizations, they are thoughtful, well facilitated and supported by senior stakeholders. The more uncomfortable observation is that testing often provides confidence within the frame we design, but less visibility beyond it.
Most exercises are built around validating recovery strategies. A disruption is defined, systems are assumed unavailable, teams follow documented responses and performance is measured against agreed tolerances. If recovery is achieved within the expected timeframe, the outcome is recorded as a success.
There is nothing wrong with that. Boards need evidence. Regulators expect demonstration. Structured exercises provide clarity and accountability.
Working assumptions
But success in that setting usually tells us that recovery works under the assumptions embedded in the scenario. It tells us less about how those assumptions behave when conditions begin to drift.
Every scenario needs boundaries. It has to define what has failed, what remains available and how dependencies are expected to respond. Without those constraints, exercises become unfocused and difficult to manage.
The trade-off is that those same boundaries shape what can be discovered.
If a recovery plan assumes that a key supplier responds within a defined timeframe, and the scenario models that response as expected, the exercise confirms coordination. It does not explore what happens when that supplier is dealing with concurrent demand across multiple clients. If manual processes are tested at steady volumes, the exercise demonstrates capability at that level. It does not necessarily reveal how human load accumulates under escalation.
In that sense, passing an exercise often confirms that a design works in controlled conditions.
Real disruption rarely stays controlled.
An example
Consider a critical SaaS platform that meets its recovery objective during testing. Failover is simulated, data integrity is confirmed and vendor communications follow the expected script. The exercise concludes with confidence that the contractual recovery timeline is achievable.
Months later, a live incident occurs. The same platform experiences disruption, but this time several major clients invoke recovery simultaneously. Vendor support queues lengthen. Status updates become less precise. Internal teams escalate while awaiting clarity, and key individuals find themselves dividing attention between recovery, stakeholder updates and parallel operational demands. None of this violates the contract. But coordination slows.
The service does not collapse, but it stretches. Resolution takes longer than rehearsed. Stakeholders begin to ask why the exercise did not surface these dynamics.
The explanation is rarely that the exercise was flawed. It is that the exercise isolated one stressor, whereas the real event layered several.
Over time, exercise outputs accumulate into assurance artefacts. Reports are presented. Metrics are tracked. Governance forums see structured evidence of preparedness. Confidence grows, and in many respects rightly so.
The difficulty is that confidence can stabilise faster than underlying conditions evolve. Dependencies change. Volumes fluctuate. Staff move roles. Suppliers adjust operating models. Documentation may remain current, but the interaction between components shifts subtly.
When exercises consistently confirm performance within expected parameters, attention tends to focus on maintaining that performance rather than probing where it might thin.
Exploring how recovery behaves
Things begin to change when we look not only at whether recovery works, but at how it behaves when conditions are less tidy than the model assumed.
Structured scenario testing performs an essential role. It creates repeatability, comparability and governance discipline, and it is well suited to demonstrating preparedness. Discovery, however, often requires something slightly different.
Exploratory exercises are rarely as tidy. They may introduce ambiguity rather than clarity. They may layer moderate stresses instead of modelling a single defined event. They may deliberately remove certain assumptions and observe how teams respond when information is incomplete.
For example, instead of fixing the duration of a supplier outage in advance, an exercise might allow uncertainty to remain and adjust conditions as the scenario unfolds. Instead of assuming full staff availability, it may introduce partial absence mid-discussion. Instead of isolating one disruption, it may combine a technical failure with elevated demand.
These variations are not about engineering catastrophe. They are about observing where coordination begins to stretch, where manual effort accumulates and where dependencies compete for the same capacity.
They are harder to score and less comfortable to report. That is why they are less common.
The risk is not that organizations fail to test. It is that they expect one format to provide both governance reassurance and insight into systemic limits.
Passing a test confirms that recovery functions within defined conditions. Reducing surprise requires understanding how that recovery behaves when conditions are less controlled, less isolated and more layered than the scenario assumed.
In complex operational environments, disruption is rarely singular. It is concurrent, uneven and cumulative. The question is not simply whether plans exist, but how they respond when several moderate pressures combine.
Surprise rarely appears because nothing was tested.
More often, it appears because the conditions that mattered most were never tested together.
