p.xi-xii Resilience engineering makes it clear that failures and successes are closely related phenomena
and not incompatible opposites. Whereas established safety approaches hold that the transition from a safe to an
unsafe state is tantamount to the failure of some component or subsystem and therefore focus on what has gone or might go
wrong, resilience engineering proposes that:
... an unsafe state may arise because system adjustments are insufficient or inappropriate rather
than because something fails... Since both failures and successes are the outcome of normal performance variability,
safety cannot be achieved by constraining - or eliminating - that. Instead, it is necessary to study both successes
and failures, and to find ways to reinforce the variability that leads to successes as well as dampen the variability that
leads to adverse outcomes... effective safety management cannot be based on a reactive approach alone... it is
necessary also to make corrections or changes in anticipation of what may happen... a resilient system is
defined by its ability effectively to adjust its functioning prior to or following changes and disturbances
so that it can continue its functioning after a disruption or a major mishap, and in the presence of continuous
stresses.
p.1 The designer, the planner, and the systems operator must always keep in mind that things could go wrong...
The "faint signals" that are often the precursors of trouble need to be heard and sent to competent authority for
action.
p.4-5 Resilience seems to be closely linked with some sort of insight into the (narrowly defined)
system, the (broadly defined) environment in which it exists, and their interactions... Resilience involves anticipation...
Deeper understanding allows at least two sources of resilience. One is to know sooner when "things are going wrong"
by picking up faint signals of impending dysfunction. The other is to have better knowledge resources that are available in
order to develop adaptive resources "on the fly." It follows that the lack of such understanding diminishes
resilience. It also follows that resulting choices that lack an understanding of how to create, configure, and operate
a system lead to less resilient (more brittle) systems. Resilience can be seen in action, and is made visible
through the way that safety and risk information are used. Resilience is an active process that implicitly draws on
the way that an organization or society can organize itself. It is more than just a set of resources because it involves
adaptation to varying demands and threats. Adaptation and restructuring make it possible for an organization to meet
varying, even unanticipated, demands.
p.5-6 resilience... [anticipates] what future events may challenge system performance.
More importantly, resilience is about having the generic ability to cope with unforeseen challenges, and having adaptable
reserves and flexibility to accommodate those challenges... resilience invests flexibility and the ability to find
and use available resources in a system in order to meet the changes that are inherent in a dynamic world... Making
changes to systems in anticipation of needs in order to meet future demands is the engineering of resilience.
p.6 To measure something, we must know its essential properties. Resilience of
materials must be measured by experiment in order to find how much a material returns to its original shape. The
same can be said for systems. The act of measurement is the key for engineers to begin to understand the nature of an unexampled
event, and the probability part of Probabilistic Risk Assessment (PRA).
p.29-30 Among the definitions of resilience are an ability to resist disorder (Fiksel, 2003), as well as
an ability to retain control, to continue and to rebuild (Hollnagel & Woods, 2006)... it may be possible only to measure [a
system's] potential for resilience... The following factors are thought to contribute to resilience (Woods, 2006):
- buffering capacity...
- flexibility/stiffness...
- margin...
- tolerance...
- cross-scale interactions
p.31 Broadly speaking, measurement may be defined as the "process of linking abstract concepts to empirical
indicants" (Carmines & Zeller, 1979)... In other words, a valid measurement is one that is capable of accessing
a phenomenon and placing its value along some scale.
p.128-129 intuitively, a robust or resilient system is one which must be able to adapt its behaviour
to unforeseen situations, such as perturbations in the environment, or to internal dysfunctions in the organisation
of the system, etc. ... a resilient system generally aims to restore the initial functions of the system without
fundamentally questioning its internal structure in charge of the regulation... From a system theory point of view, the processes
linked to robustness are very different since:
1) they inevitably do not guarantee that the function of the systems will be maintained; new functions can
emerge in the system (e.g. a new organisation or new objectives for a company, etc.)
2) it is difficult to disassociate the system from its environment since the two entities can be
closely coupled
p.129 for McDonald, resilience represents:
the capacity of an organizational system to anticipate and manage risk effectively, through appropriate
adaptation of its actions, systems and processes so as to ensure that its core functions are carried out in a stable and effective
relationship with the environment
p.130 Woods defines a resilient system as one which is able to monitor the boundary of its organization
capability and which can adapt or adjust its current model... an agent or a structure is able to anticipate unforeseen
circumstances in an intelligent way in order to drive back the system to its initial state... Following this complex system
point of view, we stress that it is necessary to distinguish between resilient engineering that is concerned with
the aim to bring back the system in its initial conditions and robustness engineering which is able to harness
the more complex (and hidden) properties of self-organized processes.
p.188 The safety fundamentals for system safety architecture
and technology... are: transparency, redundancy, interdependence, functionality, integrity, and maintainability.
p.256 Based on these elements, cross-checking fundamentally consists
in being able to question a plan in progress at any given level of the model, comparing elements to expected ones.
Expectations might differ from current elements because they are based on a different knowledge of the situation: e.g., the
situation has evolved and new events have occurred, the agent cross-checking has a different perspective... The need
to detect emerging effects in order to potentially recover from unintended negative outcomes is also essential.
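Note (not from the source): the comparison step described above can be illustrated with a minimal sketch. It assumes a plan can be reduced to named elements with simple values; the names `cross_check`, `Discrepancy`, and the example elements are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

# Placeholder used when one side has no value for an element (assumption).
MISSING = "<missing>"


@dataclass(frozen=True)
class Discrepancy:
    element: str
    current: Any   # what the plan in progress contains
    expected: Any  # what the cross-checking agent expected


def cross_check(current: Dict[str, Any], expected: Dict[str, Any]) -> List[Discrepancy]:
    """Compare the elements of a plan in progress with an agent's expectations."""
    found = []
    # Look at every element known to either side, in a stable order.
    for name in sorted(set(current) | set(expected)):
        cur = current.get(name, MISSING)
        exp = expected.get(name, MISSING)
        if cur != exp:
            found.append(Discrepancy(name, cur, exp))
    return found


# Hypothetical example: the second agent holds different knowledge of the situation.
plan_in_progress = {"step": 3, "target": "A", "resource": "pump-1"}
second_agent_expectation = {"step": 3, "target": "B", "backup": "pump-2"}

discrepancies = cross_check(plan_in_progress, second_agent_expectation)
for d in discrepancies:
    print(f"{d.element}: current={d.current!r}, expected={d.expected!r}")
```

A discrepancy here is only a prompt to question the plan; the sketch does not decide whether the plan or the expectation is the one that is out of date.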
p.260 In analogy with this we may in accident investigation propose a What-You-Look-For-Is-What-You-Find
or WYLFIWYF principle. The meaning of this is that the assumptions about possible causes (What-You-Look-For)
to a large extent will determine what is actually found (What-You-Find)... a root cause analysis implies that accidents
can be explained by finding the root - or real - causes. The assumption is in this case that the accident can be described
as a sequence, or tree, of causes and effects.