Here I build a case for using probabilistic knowledge in a computer chess program. This
article is currently under construction.
Quotes are presented from 5 references:
The Art of Positional Play by Samuel Reshevsky
Computers in Chess: Solving Inexact Search Problems by M. M. Botvinnik
Probabilistic Reasoning in Intelligent Systems by Judea Pearl
Bayesian Networks by Pourret, Naim and Marcot
Bayesian Networks and Influence Diagrams by Kjaerulff and Madsen
"...the strategizing must preserve uncertainty and allow for contingencies." - William H. Starbuck
Samuel Reshevsky, in The Art of Positional Play, argues that we should select
moves and make plans in a chess game according to the possibilities available in the position currently on the
board.
p.ix "The business of the chess player is to conceive practical objectives and to plan and carry out the
maneuvers necessary to achieve them; the objectives, the plans, the maneuvers - all must be based on the possibilities inherent
in actual positions. Thus chess is by definition positional."
How, in a given position, should we determine what is possible? Capablanca and Botvinnik both
thought that positional chess involves the control of fields, as in this extended quote from Botvinnik's Computers in
Chess: Solving Inexact Search Problems:
p.38"The positional estimate should not be a
general-purpose affair; it should be specific to each given situation... Capablanca pointed out that the basis for a positional
estimate is the control of fields. Control of fields does not mean control of the whole board, but control of only those fields
that may be used in the impending play. Therefore, one must strive for control of the field consisting of those trajectories
in which the pieces can move, but have not moved yet.
At the node in the search tree where we
find ourselves at a given moment, we must unravel all those sheaves of trajectories which have not yet been developed and
determine which player has control of the majority of the fields consisting of the trajectories not yet used in
the play. This allows us to forecast the result of the play - the result of a search which, in particular, had to be renounced
at the terminal nodes of the variations for lack of resources."
p.38"We shall show later that the positional
estimate allows us to solve the question of priorities... Thus the positional estimate, with the development of the sheaf
of trajectories, should be produced at every node in the search tree. We may assert that the squares under control
define the usable mobility and maneuverability of the pieces. Better maneuverability of pieces often also determines the positional
superiority."
p.39"It was assumed that the control of squares involves only those pieces that lie at a distance of one
move from the controlled square (and a blockading piece must lie on the square itself)."
p.39"To sum up: The positional estimate is computed at every node of the search tree. The procedure is substantially
more complex than the procedure for computing a material score. All sheaves of trajectories included in the play but not yet
used (in whole or in part) are taken into account in computing the positional value."
p.39"we see that we may get a first approximation on those trajectories on which the pieces have not yet
had time to move.
Thus the basic factor in the positional estimate is proportional to the ratio Kw /
Kb, where Kw and Kb are the numbers of squares controlled by White
and Black, respectively."
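As a rough illustration of the ratio Kw / Kb, here is a minimal Python sketch. The input format (a count of attackers per square for each side) and the rule that a square belongs to the side attacking it more often are assumptions made for this example; they are not Botvinnik's procedure, which works with sheaves of trajectories.

```python
from typing import Dict


def control_ratio(white_attackers: Dict[str, int],
                  black_attackers: Dict[str, int]) -> float:
    """Return Kw / Kb under an assumed rule: a square counts for the
    side that attacks it with more pieces; ties count for neither."""
    squares = set(white_attackers) | set(black_attackers)
    kw = sum(1 for s in squares
             if white_attackers.get(s, 0) > black_attackers.get(s, 0))
    kb = sum(1 for s in squares
             if black_attackers.get(s, 0) > white_attackers.get(s, 0))
    if kb == 0:
        # Avoid division by zero: infinite advantage, or parity if both are 0.
        return float("inf") if kw > 0 else 1.0
    return kw / kb


# Hypothetical attacker counts for a handful of squares.
white = {"e4": 2, "d5": 1, "f5": 1, "c4": 1}
black = {"e4": 1, "d5": 2, "c4": 1}
ratio = control_ratio(white, black)  # White: e4, f5; Black: d5; c4 is tied
```

Even this crude version shows the difficulty: deciding who really controls a contested square needs more than a count of attackers.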
How do we go about implementing this idea? Determining exactly which side
controls specific squares in a chess game is time-consuming, and it may not be possible to determine with certainty. Do we have another
way to approach this problem?
Let's look at some quotes from Judea Pearl's book Probabilistic Reasoning in Intelligent Systems.
In order to reason effectively in an uncertain world, we need to make simplifications. Instead of
using rules and an endless list of exceptions to these rules, we might just use probability to summarize our degree of uncertainty.
p.1-2"Reasoning about any realistic domain always requires that
some simplification be made. The very act of preparing knowledge to support reasoning requires that we leave many facts unknown,
unsaid, or crudely summarized... An alternative to the extremes of ignoring or enumerating exceptions is to summarize them,
i.e., provide some warning signs to indicate which areas of the minefield are more dangerous than others. Summarization is
essential if we wish to find a reasonable compromise between safety and speed of movement. This book studies a language in
which summaries of exceptions in the minefield of judgment and belief can be represented and processed... One way to summarize
exceptions is to assign to each proposition a numerical measure of uncertainty and then combine these measures according to
uniform syntactic principles"
What if we focus the search of our computer chess program on moves that create promising positions?
We would evaluate our pieces on their effectiveness in the game. We can build a model of 'piece effectiveness' and base it
on the relationships each piece has with the other pieces. We would then need a way to build this model that is
fast but reasonably accurate. We might choose probability as our basis for evaluation, since certainty
takes time and we do not have much time to generate our evaluation.
p.12"Our goal is to make intensional systems [Intensional
systems deal with uncertainty in a context sensitive manner. They try to model the interdependencies and relevance relationships
of the variables in the system. - JLJ] operational by making relevance relationships explicit, thus curing the impotence
of declarative statements such as P(B|A)=p [JLJ - notation for the statement: the probability of B given A is p]. As mentioned
earlier, the reason one cannot act on the basis of such declarations is that one must first make sure that other items
in the knowledge base are irrelevant to B and hence can be ignored. The trick, therefore, is to encode knowledge in such a
way that the ignorable is recognizable, or better yet, that the unignorable is quickly identified and is readily accessible...
In effect, what network representations offer is a dynamically updated list of all currently valid licenses to ignore, and
licenses to ignore constitute permissions to act."
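One way to picture a probability-based evaluation is to replace the yes/no question "who controls this square?" with an estimated probability, and then sum the estimates. The sketch below is only an assumption about how this might look; the function name and the numbers are hypothetical.

```python
from typing import Dict, Tuple


def expected_control(p_white: Dict[str, float]) -> Tuple[float, float]:
    """Given, for each square, an estimated probability that White
    controls it, return the expected number of squares controlled by
    White and by Black (Black gets the complementary probability)."""
    exp_white = sum(p_white.values())
    exp_black = sum(1.0 - p for p in p_white.values())
    return exp_white, exp_black


# Hypothetical estimates for four contested squares.
estimates = {"e4": 0.9, "d5": 0.3, "f5": 0.7, "c4": 0.5}
exp_w, exp_b = expected_control(estimates)  # about 2.4 and 1.6
```

The summation is cheap. The hard part, and the subject of the rest of this article, is producing the probabilities from the relationships among the pieces.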
Our model of piece effectiveness (in our evaluation function) should show how each chess piece contributes
to the effectiveness of the other pieces - if it in fact does so. We separate out what is relevant from what
is irrelevant, and we use probability if we are not sure.
p.13"A central requirement for managing intensional systems is to articulate the conditions under
which one item of information is considered relevant to another, given what we already know, and to encode knowledge in structures
that display these conditions vividly as the knowledge undergoes changes."
Our goal here is to produce a computational model of intelligent behavior.
p.14"The aim of artificial intelligence is to provide a computational
model of intelligent behavior, most importantly, commonsense reasoning."
Pearl's book is grounded in probability, but he argues that probability is less about numbers than about the structure of reasoning.
p.15"this book will try to communicate the idea that 'probability
is not really about numbers; it is about the structure of reasoning,' as Glenn Shafer recently wrote."
Our search for 'promising positions' in our search tree might use
heuristics constructed for this purpose, and might involve generating an influence diagram for each piece.
p.306"Clearly, the only practical way of doing planning in an uncertain
domain is to generate portions of the decision tree on the fly from more economical forms of knowledge... The difficulty
with such a scheme is that the construction of any decision tree requires three diverse sources of knowledge, each organized
by a different set of principles:
1. Causal knowledge about how events influence each other in the
domain.
2. Knowledge about what action sequences are feasible in any given
set of circumstances.
3. Normative knowledge about how desirable the consequences are.
... Influence diagrams are an attempt to capture all three knowledge
sources in one graphical representation."
p.311"an influence diagram can be evaluated by sequentially instantiating the decision and observation
nodes (in chronological order) while treating the remaining chance nodes as a Bayesian network that supplies the probabilistic
parameters necessary for tree evaluation."
We should give some thought to the information we seek to acquire - is it really worth the effort
involved, or is there a simpler, more easily computed heuristic that gives us nearly the same answer?
p.313"6.3.1 Information Sources and Their Values: It is generally
accepted that information is a useful commodity, that acting in an informed fashion is preferable to acting under ignorance.
This is why people accumulate information when it is available and purchase information when it is scarce. People also possess
strong intuition about whether one information source is more valuable (more reliable and pertinent) than another... The value
of any information source is defined as the difference between the utilities of two optimal strategies, one providing the
freedom of choosing different actions for different source outcomes, the other providing no such freedom. This criterion can
be used to rate the usefulness of various information sources and to decide whether a piece of information is worth acquiring."
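Pearl's definition can be sketched in a few lines of Python: compute the best expected utility when one action must be chosen blindly, then the best expected utility when a different action may be chosen for each source outcome, and take the difference. The chess-flavoured states, actions and numbers below are hypothetical.

```python
def value_of_information(prior, likelihood, utility):
    """prior[s]         : P(state = s)
    likelihood[s][o] : P(source outcome = o | state = s)
    utility[a][s]    : utility of action a when the state is s
    Returns expected utility with the source minus without it."""
    states = list(prior)
    actions = list(utility)
    outcomes = list(next(iter(likelihood.values())))

    # Strategy 1: a single action, chosen without consulting the source.
    eu_without = max(sum(prior[s] * utility[a][s] for s in states)
                     for a in actions)

    # Strategy 2: the best action is chosen separately for each outcome.
    eu_with = 0.0
    for o in outcomes:
        joint = {s: prior[s] * likelihood[s][o] for s in states}  # P(s, o)
        eu_with += max(sum(joint[s] * utility[a][s] for s in states)
                       for a in actions)
    return eu_with - eu_without


# Hypothetical example: is a sacrifice sound? A (perfect) deep search tells us.
prior = {"sound": 0.5, "unsound": 0.5}
likelihood = {"sound": {"yes": 1.0, "no": 0.0},
              "unsound": {"yes": 0.0, "no": 1.0}}
utility = {"sacrifice": {"sound": 1.0, "unsound": -1.0},
           "quiet": {"sound": 0.0, "unsound": 0.0}}
voi = value_of_information(prior, likelihood, utility)
```

If the cost of running the search (in time) exceeds this value, the information is not worth acquiring.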
Critical to the success of our computer chess program is focusing its attention on the relevant
lines of analysis, and ignoring the lines that do not matter.
p.318"6.4.1 Focusing Attention: Control is the process
of scheduling the activation of information sources, both external (e.g., acquiring new input) and internal (e.g., invoking
rules or updating beliefs). Decision analysis provides a framework for scheduling all computational activities so as to focus
on specific goals - updating the belief in a target set of hypotheses, shifting attention to a new set, and terminating the
activity once we reach an acceptable level of confidence in a hypothesis.
The main reason for focusing attention on a select set of target hypotheses is to economize
the acquisition of new data. Let us imagine a subset S of the nodes (normally the leaves) that are known to be sensory or
observable nodes for a given problem domain (e.g., laboratory tests in medical diagnosis). In general, the instantiation of
any of these sensory nodes incurs a positive cost, and the utility of the information they convey might be insufficient to
justify this cost. Thus, it is important to decide which node in S should be instantiated first, based on the information
it contributes to the decision at hand, i.e., the target node. If utility information is available, then the value node naturally
is the target. If we lack utility information, we assign priorities to pending information sources based on their degree of
informativeness."
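When utility information is lacking, one common measure of "degree of informativeness" is the expected reduction in entropy of the target hypothesis (mutual information). Using that measure is an assumption on my part; Pearl's text does not commit us to it. The probe names and numbers below are hypothetical.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a dict mapping value -> probability."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def expected_information_gain(prior, likelihood):
    """Expected entropy reduction of the target after observing a source.
    prior[h]: P(target = h); likelihood[h][o]: P(outcome = o | target = h)."""
    outcomes = list(next(iter(likelihood.values())))
    gain = entropy(prior)
    for o in outcomes:
        p_o = sum(prior[h] * likelihood[h][o] for h in prior)
        if p_o == 0:
            continue
        posterior = {h: prior[h] * likelihood[h][o] / p_o for h in prior}
        gain -= p_o * entropy(posterior)
    return gain


prior = {"winning": 0.5, "not_winning": 0.5}
sources = {
    "king_safety_probe": {"winning": {"pos": 0.9, "neg": 0.1},
                          "not_winning": {"pos": 0.2, "neg": 0.8}},
    "pawn_count_probe": {"winning": {"pos": 0.55, "neg": 0.45},
                         "not_winning": {"pos": 0.5, "neg": 0.5}},
}
# Instantiate the most informative source first.
ranking = sorted(sources,
                 key=lambda name: expected_information_gain(prior, sources[name]),
                 reverse=True)
```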
If we cannot search for checkmate directly, we can set subgoals, such as searching for positions where our pieces have influence and are fully
engaged (perhaps menacing weaknesses in the opponent's position or defending weaknesses in our own).
p.326"The task of controlling reasoning activities was formulated as that of finding an optimal
schedule for activating information sources. Decision theory provides a framework for assessing the knowledge and computations
needed to perform this optimization precisely. It turns out that the knowledge required is often unavailable... Subgoaling
strategies emerge as a reasonable compromise; they are computationally tractable... they still provide a focused way of acquiring
information."
Let's now look at some quotes from Bayesian Networks by Pourret, Naim and Marcot.
We might begin by looking at probabilistic models (successfully) constructed for other applications,
and we see that there is no reason we cannot try to construct a model of this type for use in a computer
chess program.
xi"Bayesian
networks, named after the works of Thomas Bayes (ca. 1702-1761) on the theory of probability, have emerged as the result of
mathematical research carried out in the 1980s, notably by Judea Pearl at UCLA, and from that time on, have proved successful
in a large variety of applications.
This
book is intended for users, and also potential users of Bayesian networks: engineers, analysts, researchers, computer
scientists, students and users of other modeling or statistical techniques. It has been written with a dual purpose in mind:
- highlight the versatility and modeling power of Bayesian networks, and also discuss their limitations
and failures, in order to help potential users to assess the adequacy of Bayesian networks to their needs;
- provide practical guidance on constructing and using Bayesian networks."
A model is an effective tool for simplifying a complex situation, and manipulating the
model can give us strategic insight.
p.1"Real-world problems... are often described
as complex... Furthermore... a variety of factors... tend to distort our judgment of a situation.
One way of trying to better handle
reality - in spite of these limitations and biases - is to use representations of reality called models."
A model might be constructed to satisfy the needs of an individual or organization,
perhaps to obtain insight for planning purposes.
p.2"the purpose of a model is to satisfy
the need of some person or organization having a particular interest in one or several aspects of the object, but not in a
comprehensive understanding of its properties."
p.3"Definition 2 (Model) A model is
a representation of an object, expressed in a specific language and in a usable form, and is intended to satisfy one or several
need(s) of some stakeholder(s) of the object."
We use models to produce information and to help us reason about uncertainty.
p.3"Models are thus used to produce information
(evaluations, appropriate decisions or actions) on the basis of some input information, considered as valid. This process
is called inference."
A model should have elements, and should say how it works.
p.4"the way a model is constructed obviously
depends on several factors, such as the nature of the object, the stakeholder's need(s), the available knowledge and information,
the time and resources devoted to the model elaboration, etc. Nevertheless, we may identify two invariants in the process
of constructing a model... Splitting the object into elements... Saying how it works: the modeling language"
One way to construct a model is with a graph or picture showing the relationships among the various
parts.
p.5"most successful or unsuccessful attempts
of mankind to overcome the complexity of reality have involved, at some stage, a form [of] a graphical representation."
We assign the name variables to the parts of our model that are unknown.
p.5"During the modeling process, the exact
circumstances in which the model is going to be used (especially, what input data the model will process) are, to a large
extent, unknown. Also, some of the attributes remain unknown when the model is used: the attributes which are at some stage
unknown are more conveniently described by variables."
We can use probability to model our doubt about how strongly one variable influences another.
p.7"Doubt is a typically human faculty
which can be considered as the basis of any scientific process... The construction of a probabilistic model requires the systematic
examination of all possible values of each variable... it is hard to imagine a more precise representation of an object: each
of the theoretically possible configurations of the object is considered, and to each of them is associated one element of
the infinite set [0;1]. [JLJ - a probability between 0 and 1]"
If we can split our model into several subsets (which are ideally independent from each other), it
becomes easier to operate.
p.9"Following Descartes's precept of dividing
the difficulties, one may try to split the set of n variables into several subsets of smaller sizes which can relatively
be analyzed separately... Then the modeling problem can be transformed into two simpler ones."
A Bayesian network is a graphical representation of the influences that one variable has over another
variable.
p.11"In the lorry [truck] driver and doped
athlete examples, we have identified the most direct and significant influences between the variables, and simplified the
derivation of the joint probability distribution. By representing these influences in a graphical form, we now introduce the
notion of [a] Bayesian network."
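The simplification of the joint distribution mentioned in the quote is the standard factorization used by Bayesian networks: each variable needs a probability table conditioned only on its parents in the graph. Here pa(X_i) denotes the parents of X_i. The three-variable case, with A and B both influencing C, is my own illustration and is not taken from the book.

```latex
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{pa}(X_i)\bigr)
\qquad\text{e.g.}\qquad
P(A, B, C) = P(A)\, P(B)\, P(C \mid A, B)
```

The saving grows quickly with the number of variables, provided each variable has only a few parents.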
We can draw conclusions from a Bayesian network via the propagation of evidence.
p.26"Inference [section title] The most
crucial task of an expert system is to draw conclusions based on new evidence. The mechanism of drawing conclusions in a system
that is based on a probabilistic graphical model is known as propagation of evidence. Propagation of evidence involves
essentially updating probabilities given observed variables of a model (also known as belief updating)."
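Belief updating can be shown on a toy network by brute-force enumeration of the joint distribution. The three variables (a piece is active, a target is weak, pressure is observed) and every number below are hypothetical, and real inference engines use far more efficient propagation algorithms than enumeration.

```python
from itertools import product

# Hypothetical network: PieceActive -> Pressure <- TargetWeak
P_ACTIVE = 0.6   # prior P(PieceActive = True)
P_WEAK = 0.3     # prior P(TargetWeak = True)
# P(Pressure = True | PieceActive, TargetWeak)
P_PRESSURE = {(True, True): 0.9, (True, False): 0.4,
              (False, True): 0.2, (False, False): 0.05}


def joint(active, weak, pressure):
    """Joint probability of one full assignment, from the factorization
    P(active) * P(weak) * P(pressure | active, weak)."""
    p = (P_ACTIVE if active else 1 - P_ACTIVE) * (P_WEAK if weak else 1 - P_WEAK)
    p_press = P_PRESSURE[(active, weak)]
    return p * (p_press if pressure else 1 - p_press)


def posterior_weak(pressure=True):
    """P(TargetWeak = True | Pressure = pressure), by enumeration."""
    num = sum(joint(a, True, pressure) for a in (True, False))
    den = sum(joint(a, w, pressure)
              for a, w in product((True, False), repeat=2))
    return num / den


belief = posterior_weak(True)  # rises from the prior 0.3 once pressure is seen
```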
We might instead use a rule-based system to model our network of interacting pieces on the chessboard.
We might feel safe because we do not have to address uncertainty. But this does not guarantee that our rule-based
model is any better, or even that it is correct. It may simply be wrong, because we do not know enough to be certain
how we stand.
p.31"Rule-based systems capture heuristic
knowledge from the experts and allow for a direct construction of a classification relation... Rule-based systems may be expected
to perform well for problems that cannot be modeled using causality as a guiding principle, or when a problem is too complicated
to be modeled as a causal graph."
Bayesian networks are effective tools for modeling processes of medical reasoning. Decision making
in an emergency room arguably resembles choosing moves in a chess game: in both, we must act under time pressure on incomplete information. Bayesian networks have been successfully created
to aid decision making in an emergency room. Perhaps they can be used as well to aid decision making in a chess game.
p.32"Bayesian networks are recognized as
a convenient tool for modeling processes of medical reasoning. There are several features of Bayesian networks that are specially
useful in modeling in medicine. One of these features is that they allow us to combine expert knowledge with existing clinical
data."
p.54"The use of Bayesian networks in biomedical
sciences can be traced as far back as the early decades of the 20th century, when Sewell Wright developed path analysis to
aid the study of genetic inheritance. Neglected for many years, Bayesian Networks were reintroduced in the early 1980s as
an analytic tool capable [of] encoding the information acquired from human experts. Compared to decision-rule based 'expert-systems'
that were limited in their ability to reason under uncertainty, Bayesian networks were probabilistic expert systems that used
probability theory to account for uncertainty in automated reasoning for diagnostic and prognostic tasks. This type of probabilistic
reasoning was made possible by the development of algorithms to propagate probabilistic information through a network."
Bayesian networks have been useful in other domains:
p.71"Bayesian networks provide a flexible
modeling framework to describe complex systems in a modular way."
p.84"The BN [Bayesian network] can provide
useful information for crime risk factor analysis."
p.185"Bayesian networks provide a general
and effective framework for knowledge representation and reasoning under uncertainty."
An influence diagram can be constructed from a Bayesian network.
p.210"Once the BN [Bayesian network] has
been constructed, it is enlarged by including decision and utility nodes, thus transforming it into an influence diagram."
Bayesian networks allow us to use various sources of knowledge.
p.384"Although Bayesian networks are certainly
not the Holy Grail of artificial intelligence, they definitely are a solid basis for knowledge engineering. They allow us
to use various sources of knowledge, even contradicting ones, to make knowledge embedded in data explicit, to use this knowledge
for various types of problem solving, and finally to improve it through online learning.
Artificial intelligence remains a challenge
for the next decades. Indeed, intelligence cannot be limited to inference and learning, but requires action. Embedding artificial
intelligence systems in the real world is probably the next challenge of artificial intelligence, far beyond simply connecting
an offline 'artificially intelligent system' to external sensors and actuators."
Let's now look at quotes from Bayesian Networks and Influence Diagrams by Kjaerulff and Madsen.
If you wish to look at practical applications for Bayesian networks, this book is a good place to
start.
VII-VIII"This book is a monograph [a scholarly piece of writing of essay or book length on a specific,
often limited subject -JLJ] on practical aspects of probabilistic networks (a.k.a. probabilistic graphical models) and is
intended to provide a comprehensive guide for practitioners that wish to understand, construct, and analyze decision support
systems based on probabilistic networks, including a number of different variants of Bayesian networks and influence diagrams...
inference in probabilistic networks is based on a well-established theoretical foundation of probability calculus and decision
theory, and hence provides mathematically coherent methods for deriving conclusions under uncertainty, where multiple sources
of information and complex interaction patterns are involved"
We build our probabilistic networks with a goal in mind: to solve an intellectually challenging task,
or to derive conclusions from a body of knowledge.
p.3"Solving an intellectually challenging task can be characterized as a process of deriving conclusions
(new pieces of knowledge) by manipulating a (large) body of knowledge, typically including definitions of entities (objects,
concepts, events, phenomena, etc.), relations among them, and observations of states (values) of some of the entities."
We first identify the relevant variables and the causal relations among them. For our chess program,
this is essentially the influence each piece exerts in the game - the pressure (constrained or unconstrained) it exerts on
other pieces and the constraints it places on the ability of the enemy pieces to pressure our pieces.
p.10"The construction of a Bayesian network thus runs in two phases. First, given the problem at hand,
one identifies the relevant variables and the (causal) relations among them. The resulting DAG [acyclic directed graph] specifies
a set of dependence and independence assumptions that will be enforced on the joint probability distributions... one for each
'family'... of the DAG. A Bayesian network can be constructed manually, (semi-) automatically from data, or through
a combination of a manual and a data driven process, where partial knowledge about a structure as well as parameters (i.e.,
conditional probabilities) blend with statistical information extracted from databases of cases (i.e., previous joint observations
of values of the variables)... Extensive guidance on how to manually construct a probabilistic network is the core of this
book."
An important step in evaluation of piece pressure is determining which pieces have no influence on
the piece in question and can be eliminated from calculations of effectiveness.
p.49"The single most important key to efficient inference in probabilistic networks is the ability
to take advantage of the distributive law (i.e., to find optimal (or near optimal) sequences in which the variables are marginalized
out)... Variables of a probabilistic network that have no descendants and are never observed are called barren variables,
as they provide no information relevant for the inference process... and may hence be removed from the network."
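Removing barren variables can be sketched as repeated pruning of leaves that are neither observed nor queried. The dictionary representation, and the decision to always keep the target variables, are assumptions of this sketch.

```python
def remove_barren(parents, observed, targets):
    """parents: dict mapping every node to the list of its parents (a DAG).
    Repeatedly drop nodes that have no remaining children and are neither
    observed nor targets. Returns the set of nodes that are kept."""
    keep = set(parents)
    changed = True
    while changed:
        changed = False
        for node in list(keep):
            has_child = any(node in parents[c] for c in keep if c != node)
            if not has_child and node not in observed and node not in targets:
                keep.remove(node)  # barren: cannot affect the query
                changed = True
    return keep


# Hypothetical DAG: A -> B, B -> C, B -> D, D -> E
dag = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"], "E": ["D"]}
kept = remove_barren(dag, observed={"C"}, targets={"A"})
# E is barren; once E is gone, D becomes barren as well.
```

For the chess program, this corresponds to dropping pieces that have no bearing on the piece in question before any effectiveness calculation is made.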
p.63"Many real-life situations can be modeled as a domain of entities represented as random variables
in a probabilistic network. A probabilistic network is a clever graphical representation of dependence and independence relations
between random variables... A probabilistic network represents and processes probabilistic knowledge...The graphical representation
of a probabilistic network describes knowledge of a problem domain in a precise manner. The graphical representation is intuitive
and easy to comprehend, making it an ideal tool for communication of domain knowledge between experts, users, and systems.
For these reasons, the formalism of probabilistic networks is becoming an increasingly popular knowledge representation for
reasoning and decision making under uncertainty."
p.74"Decision Making Under Uncertainty The framework of influence diagrams (Howard & Matheson
1981) is an effective modeling framework for representation and analysis of (Bayesian) decision making under uncertainty.
Influence diagrams provide a natural representation for capturing the semantics of decision making with a minimum of clutter
and confusion for the decision maker (Shachter & Peot 1992). Solving a decision problem amounts to (i) determining an
optimal strategy that maximizes the expected utility for the decision maker and (ii) computing the maximal expected utility
of adhering to this strategy."
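For a single decision, steps (i) and (ii) reduce to computing the expected utility of each option under our current beliefs and taking the maximum. A minimal sketch follows; the states, options and utilities are hypothetical.

```python
def best_decision(belief, utility):
    """belief[s]     : P(state = s) after any evidence has been propagated
    utility[d][s] : utility of decision d when the state is s
    Returns (optimal decision, its expected utility)."""
    expected = {d: sum(belief[s] * u for s, u in row.items())
                for d, row in utility.items()}
    best = max(expected, key=expected.get)
    return best, expected[best]


belief = {"attack_works": 0.4, "attack_fails": 0.6}
utility = {
    "attack": {"attack_works": 1.0, "attack_fails": -0.5},
    "consolidate": {"attack_works": 0.2, "attack_fails": 0.2},
}
choice, meu = best_decision(belief, utility)
# attack: 0.4 * 1.0 + 0.6 * (-0.5) = 0.1; consolidate: 0.2
```

A full influence diagram would chain several such decisions with observations in between, but each step has this same shape.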
p.107"We build knowledge bases in order to formulate our knowledge about a certain problem domain
in a structured way. The purpose of the knowledge base is to support our reasoning about events and decisions in a domain
with inherent uncertainty... An expert system consists of a knowledge base and an inference engine... The knowledge base is
the Bayesian network or influence diagram, whereas the inference engine consists of a set of generic methods that applies
the knowledge formulated in the knowledge base on task-specific data sets, known as evidence, to compute solutions to queries
against the knowledge base. The knowledge base alone is of limited use if it cannot be applied to update our belief about
the state of the world or to identify (optimal) decisions in the light of new knowledge... the knowledge bases we consider
are probabilistic networks."
p.111"Given a query and a set of evidence variables, the contribution from a nuisance variable does
not depend on the observed values of the evidence variables. Hence, if a query is to be solved with respect to multiple instantiations
over the evidence variables, then the nuisance variables (and barren variables) may be eliminated in a preprocessing step
to obtain the relevant network (Lin and Druzdzel 1997). The relevant network consists of target variables, evidence variables,
and variables on paths between target and evidence variables only."
p.122"Probabilistic inference is the task of updating our belief about the state of the world in light
of evidence. Evidence on discrete variables, be it hard or soft evidence, is treated as in the case of discrete Bayesian networks."
p.124"We build decision models in order to support efficient reasoning and decision making under uncertainty
in a given problem domain. Reasoning under uncertainty is the task of computing our updated beliefs in (unobserved) events
given observations on other events whereas decision making under uncertainty is the task of identifying the (optimal) decision
strategy for the decision maker given observations."
p.137"We build decision models in order to support efficient reasoning and decision making under uncertainty
in a given problem domain. Reasoning under uncertainty is the task of computing our updated beliefs in (observed) events given
observations on other events [i.e., evidence] whereas decision making under uncertainty is the task of identifying the (optimal)
decision strategy for the decision maker given observations."
p.144"There are many good reasons to choose probabilistic networks as the modeling framework, including
the coherent and mathematically sound handling of uncertainty and normative decision making, the automated construction and
adaptation of models based on data, the intuitive and compact representation of cause-effect relations and (conditional) dependence
and independence relations, the efficient solution of queries given evidence, and the ability to support a whole range of
analyses of the results produced, including conflict analysis, sensitivity analysis (with respect to both parameters and evidence),
and value-of-information analysis."
p.145"There are four ground characteristics that constitute the foundation of (normative) probabilistic
models:
Graphical representation of causal relations among domain entities (variables). The notion of causality
is central in probabilistic networks, meaning that a directed link from one variable to another (usually) signifies a causal
relation among the two...
Strengths of probabilistic relations are represented by (conditional) probabilities. Causal relations
among variables are seldom deterministic in the sense that if the cause is present, then the effect can be concluded by certainty...
Preferences are represented as utilities on a numerical scale. All sorts of preferences that are relevant
in a decision scenario must be expressed on a numerical scale...
Recommendations are based on the principle of maximal expected utility. As the reasoning performed
by a probabilistic network is normative, the outcome (e.g., most likely diagnosis or suggested decision) is guaranteed to
provide a recommended course of action that maximizes the expected utility to the extent that the model is a 'true' representation
of problem domain."
p.146-147"we might set up the following criteria to be met for probabilistic networks to potentially
be a good candidate technology for solving the problem at hand:
Well defined variables. The variables and events (i.e., possible values of the variables) of the problem
domain need to be well-defined...
Highly structured problem domain with identifiable cause-effect relations. Well-established and detailed
knowledge should be available concerning structure (variables and (causal) links), conditional probabilities, and utilities
(preferences). In general, the structure needs to be static (i.e., not changing over time), although re-estimation of structure
(often the usage of learning tools; see chapter 8) can be performed...
Uncertainty associated with the cause-effect relations. If all cause-effect relations are deterministic
(i.e., all conditional probabilities either take the value 0 or the value 1), more efficient technologies probably exist...
Repetitive problem solving. Often, for the (sometimes large) effort invested in constructing a probabilistic
network to pay off, the problem solved should be of a repetitive nature. A physician diagnosing respiratory diseases, an Internet
company profiling their customers, and a bank deciding to grant loans to its customers are all examples of problems that need
to be solved over and over again, where the involved variables and causal mechanisms are invariant over time, and only the
values observed for (some of) the variables differ...
Maximization of expected utility. For the probabilistic network framework to be a natural choice,
the problem at hand should most probably contain an element of decision making involving a desire to maximize the expected
utility of a decision."
p.148"Constraint variables (see chapter 7) also depend deterministically on its parent variables.
Such 'artificial' variables can be handy in many modeling situations, for example, reducing the number of conditional probabilities
needed to be specified or enforcing constraints on the combinations of states among a subset of the variables."
p.149-150"Identifying the variables of a problem domain is not always an easy task, and requires some
practicing... one needs to focus on the problem (possible diagnoses, classifications, predictions, decisions, etc. to be made)
and the relevant pieces of information for solving the problem... In the process of identifying the variables it can be useful
to distinguish between different types of variables:
Problem variables: These are the variables of interest; i.e., those for which we want to compute their
posterior probability given observations of values for information variables (see next item). Usually, the values of problem
variables cannot be observed; otherwise, there would not be any point in constructing a probabilistic network in the first
place...
Information variables: These are the variables for which observations may be available, and which
can provide information relevant for solving the problem. Two sub-categories of information variables can be distinguished:
Background information... Symptom information...
Mediating variables: These are unobservable variables for which posterior probabilities are not of
immediate interest, but which play important roles for achieving correct conditional independence and dependence properties
and/or efficient inference."
p.152-153"Given an initial set of variables identified for a given problem domain, the next step in
the model construction process concerns the identification and verification of (causal) links of the model."
p.170"When constructing a model (probabilistic or not) it is crucial to realize that real-world problem
domains are usually embedded in a complex reality involving interaction with numerous different aspects of the real world
in a way that can never be fully captured in a model. Also, the internal causal mechanisms of a problem domain can almost
always only be approximately described in a model. Thus it is important to bear in mind that all models are wrong, but that
some might be useful."
p.171"In his writings, William of Occam (or Ockham) (1284-1347) stressed the Aristotelian principle
that entities must not be multiplied beyond what is necessary. This principle became known as Occam's Razor or the law of
parsimony; a problem should be stated in its basic and simplest terms. In science, the simplest theory that fits the facts
of the problem is the one that should be selected. This rule is interpreted to mean that the simplest of two or more competing
theories is preferable and that an explanation for unknown phenomena should first be attempted in terms of what is already
known."
p.174"we pointed to the fact that the best models are usually constructed through deliberate use of
the law of parsimony (or Occam's razor)."
p.220"An influence diagram is useful for solving problems of decision making under uncertainty. The
variables of an influence diagram consist of a mixture of random variables and decision variables. The random variables are
used for representing uncertainty while the decision variables represent entities under the full control of the decision maker.
The state of a random variable may be observable or hidden while the state of a decision variable is under the full control
of the decision maker."
p.261"It is difficult or even impossible to construct models covering all aspects of (complex) problem
domains of interest. A model is therefore most often an approximation of a problem domain that is designed to be applied according
to the assumptions as determined by the background condition or context of the model. If a model is used under circumstances
not consistent with the background condition, the results will in general be unreliable."