Copyright (c) 2013 John L. Jerz

Bayesian Networks and Influence Diagrams (Kjaerulff, Madsen, 2008)

A Guide to Construction and Analysis

Probabilistic networks, also known as Bayesian networks and influence diagrams, have become one of the most promising technologies in the area of applied artificial intelligence, offering intuitive, efficient, and reliable methods for diagnosis, prediction, decision making, classification, troubleshooting, and data mining under uncertainty.

Bayesian Networks and Influence Diagrams: A Guide to Construction and Analysis provides a comprehensive guide for practitioners who wish to understand, construct, and analyze intelligent systems for decision support based on probabilistic networks. Intended primarily for practitioners, this book does not require sophisticated mathematical skills or deep understanding of the underlying theory and methods nor does it discuss alternative technologies for reasoning under uncertainty. The theory and methods presented are illustrated through more than 140 examples, and exercises are included for the reader to check his/her level of understanding.

The techniques and methods presented for knowledge elicitation, model construction and verification, modeling techniques and tricks, learning models from data, and analyses of models have all been developed and refined on the basis of numerous courses that the authors have held for practitioners worldwide.

VII-VIII This book is a monograph [a scholarly piece of writing of essay or book length on a specific, often limited subject -JLJ] on practical aspects of probabilistic networks (a.k.a. probabilistic graphical models) and is intended to provide a comprehensive guide for practitioners that wish to understand, construct, and analyze decision support systems based on probabilistic networks, including a number of different variants of Bayesian networks and influence diagrams... inference in probabilistic networks is based on a well-established theoretical foundation of probability calculus and decision theory, and hence provides mathematically coherent methods for deriving conclusions under uncertainty, where multiple sources of information and complex interaction patterns are involved
 
p.3 Solving an intellectually challenging task can be characterized as a process of deriving conclusions (new pieces of knowledge) by manipulating a (large) body of knowledge, typically including definitions of entities (objects, concepts, events, phenomena, etc.), relations among them, and observations of states (values) of some of the entities.
 
p.3 By formulating the physician's knowledge in an appropriate formal (computer) language for which there exist methods for making inferences to manipulate pieces of knowledge formulated in this language, the reasoning conducted by the physician can be automated and carried out by a computer. [JLJ - perhaps this concept can be used to "teach" a machine how to "play" a game]
 
p.4 Randomness and uncertain judgment [are] inherent in most real-world decision problems. We therefore need a method (paradigm) that supports representation of quantitative measures of uncertain statements and a method for combining the measures such that reasoning and decision making under uncertainty can be automated.
 
p.6 A rule-based system, like any other knowledge representation scheme, represents a certain part of the world (the problem domain) only up to some precision. This implies that certain (causal) mechanisms might be ignored as being unimportant for the precision (or level of detail) at which conclusions need to be drawn... Violating the "causal direction" in formulating rules is, however, not advisable.
 
p.10 The construction of a Bayesian network thus runs in two phases. First, given the problem at hand, one identifies the relevant variables and the (causal) relations among them. The resulting DAG [acyclic directed graph] specifies a set of dependence and independence assumptions that will be enforced on the joint probability distributions... one for each "family"... of the DAG.
  A Bayesian network can be constructed manually, (semi-) automatically from data, or through a combination of a manual and a data driven process, where partial knowledge about a structure as well as parameters (i.e., conditional probabilities) blend with statistical information extracted from databases of cases (i.e., previous joint observations of values of the variables)... Extensive guidance on how to manually construct a probabilistic network is the core of this book.
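
The two phases can be sketched in a few lines of Python. This is a minimal illustration, not an example from the book: the variable names (Rain, Sprinkler, WetGrass) and every probability are assumptions chosen for the sketch. Phase one records the parents of each variable; phase two attaches one conditional probability table to each family (a variable together with its parents). The joint distribution is then the product of the family tables.

```python
from itertools import product

# Phase 1: structure. Each variable maps to its list of parents (a DAG).
# Variable names are hypothetical, chosen only for illustration.
parents = {
    "Rain": [],
    "Sprinkler": [],
    "WetGrass": ["Rain", "Sprinkler"],
}

# Phase 2: parameters. One table per family; the numbers are made up.
# Each table maps a tuple of parent values to P(variable = True | parents).
cpt = {
    "Rain": {(): 0.2},
    "Sprinkler": {(): 0.1},
    "WetGrass": {
        (True, True): 0.99,
        (True, False): 0.9,
        (False, True): 0.8,
        (False, False): 0.0,
    },
}


def joint_probability(assignment):
    """Chain rule for Bayesian networks: product of P(v | parents(v))."""
    p = 1.0
    for var, pars in parents.items():
        p_true = cpt[var][tuple(assignment[q] for q in pars)]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p


def all_assignments():
    """Enumerate every joint configuration of the binary variables."""
    names = list(parents)
    for values in product([True, False], repeat=len(names)):
        yield dict(zip(names, values))


# Sanity check: the joint distribution sums to one.
total = sum(joint_probability(a) for a in all_assignments())
```

Storing one table per family is what keeps the representation compact: three small tables here instead of a single table over all eight joint configurations.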
 
p.17 Probabilistic networks are graphical models of (causal) interactions among a set of variables, where the variables are represented as vertices (nodes) of a graph and the interactions (direct dependencies) as directed edges (links or arcs) between the vertices. Any pair of unconnected vertices of such a graph indicates (conditional) independence between the variables represented by these vertices under particular circumstances that can easily be read from the graph. Hence, probabilistic networks capture a set of (conditional) dependencies and independence properties associated with the variables represented in the network.
  Graphs have proven themselves an intuitive language for representing such dependence and independence statements, and thus provide [an] excellent language for communicating and discussing dependence and independence relations among problem-domain variables.
 
p.24, 25 Causality plays an important role in the process of constructing probabilistic network models. There are a number of reasons why proper modeling of causal relations is important or helpful... one does not have to construct a model where the links can be interpreted as causal relations, it just makes the model much more intuitive, eases the process of getting the dependence and independence relations right, and significantly eases the process of eliciting the conditional probabilities of the model.
 
p.49 The single most important key to efficient inference in probabilistic networks is the ability to take advantage of the distributive law (i.e., to find optimal (or near optimal) sequences in which the variables are marginalized out)... Variables of a probabilistic network that have no descendants and are never observed are called barren variables, as they provide no information relevant for the inference process... and may hence be removed from the network.
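
The role of the distributive law can be illustrated on a three-variable chain A -> B -> C. The sketch below is an assumption-laden toy (the chain, the variable names, and all numbers are invented for illustration). It computes P(C) twice: once by summing the full joint, and once by pushing the sum over A inside the product so that A is marginalized out before B. Both give the same answer, but the second never builds the full joint table.

```python
# Chain A -> B -> C with binary variables (index 0 = false, 1 = true).
# All numbers are made up for illustration.
p_a = [0.7, 0.3]                            # P(A)
p_b_given_a = [[0.9, 0.1], [0.4, 0.6]]      # p_b_given_a[a][b] = P(B=b | A=a)
p_c_given_b = [[0.8, 0.2], [0.25, 0.75]]    # p_c_given_b[b][c] = P(C=c | B=b)


def marginal_c_naive(c):
    """Sum the full joint P(A, B, C=c) over A and B."""
    return sum(
        p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
        for a in range(2)
        for b in range(2)
    )


def marginal_c_eliminated(c):
    """Use the distributive law: marginalize A first, then B."""
    # Inner sum over A yields the intermediate table P(B).
    p_b = [sum(p_a[a] * p_b_given_a[a][b] for a in range(2)) for b in range(2)]
    # Outer sum over B uses only P(B) and P(C | B).
    return sum(p_b[b] * p_c_given_b[b][c] for b in range(2))


naive = marginal_c_naive(1)
eliminated = marginal_c_eliminated(1)
```

On a chain this small the saving is negligible; in larger networks the order in which variables are marginalized out determines the size of the intermediate tables, which is why the quoted passage singles out good elimination sequences as the key to efficiency.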
 
p.63 Many real-life situations can be modeled as a domain of entities represented as random variables in a probabilistic network. A probabilistic network is a clever graphical representation of dependence and independence relations between random variables... A probabilistic network represents and processes probabilistic knowledge... The graphical representation of a probabilistic network describes knowledge of a problem domain in a precise manner. The graphical representation is intuitive and easy to comprehend, making it an ideal tool for communication of domain knowledge between experts, users, and systems. For these reasons, the formalism of probabilistic networks is becoming an increasingly popular knowledge representation for reasoning and decision making under uncertainty.
 
p.74 Decision Making Under Uncertainty
The framework of influence diagrams (Howard & Matheson 1981) is an effective modeling framework for representation and analysis of (Bayesian) decision making under uncertainty. Influence diagrams provide a natural representation for capturing the semantics of decision making with a minimum of clutter and confusion for the decision maker (Shachter & Peot 1992). Solving a decision problem amounts to (i) determining an optimal strategy that maximizes the expected utility for the decision maker and (ii) computing the maximal expected utility of adhering to this strategy.
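
The two parts of solving a decision problem can be sketched for a very small influence diagram. Everything in the example is an assumption made for illustration: one chance variable (weather), one observation made before deciding (a forecast), one decision (take or leave an umbrella), and an invented utility table. A strategy maps each possible observation to a decision; the sketch finds the strategy that maximizes expected utility and reports that maximal expected utility.

```python
# Hypothetical toy model; names and numbers are not from the book.
p_weather = {"rain": 0.3, "dry": 0.7}        # P(weather)
p_forecast = {                               # p_forecast[weather][forecast]
    "rain": {"wet": 0.8, "fine": 0.2},
    "dry": {"wet": 0.1, "fine": 0.9},
}
utility = {                                  # utility[(decision, weather)]
    ("take", "rain"): 70, ("take", "dry"): 80,
    ("leave", "rain"): 0, ("leave", "dry"): 100,
}
decisions = ["take", "leave"]
forecasts = ["wet", "fine"]


def solve():
    """Return (optimal strategy, maximal expected utility)."""
    strategy = {}
    meu = 0.0
    for forecast in forecasts:
        def score(decision):
            # Sum over weather of P(weather, forecast) * U(decision, weather).
            return sum(
                p_weather[w] * p_forecast[w][forecast] * utility[(decision, w)]
                for w in p_weather
            )
        best = max(decisions, key=score)
        strategy[forecast] = best   # (i) the optimal choice for this observation
        meu += score(best)          # (ii) its contribution to the expected utility
    return strategy, meu


strategy, meu = solve()
```

With these numbers the optimal strategy is to take the umbrella on a "wet" forecast and leave it otherwise, with a maximal expected utility of 85.4.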
 
p.107 We build knowledge bases in order to formulate our knowledge about a certain problem domain in a structured way. The purpose of the knowledge base is to support our reasoning about events and decisions in a domain with inherent uncertainty... An expert system consists of a knowledge base and an inference engine... The knowledge base is the Bayesian network or influence diagram, whereas the inference engine consists of a set of generic methods that applies the knowledge formulated in the knowledge base on task-specific data sets, known as evidence, to compute solutions to queries against the knowledge base. The knowledge base alone is of limited use if it cannot be applied to update our belief about the state of the world or to identify (optimal) decisions in the light of new knowledge... the knowledge bases we consider are probabilistic networks.
 
p.111 Given a query and a set of evidence variables, the contribution from a nuisance variable does not depend on the observed values of the evidence variables. Hence, if a query is to be solved with respect to multiple instantiations over the evidence variables, then the nuisance variables (and barren variables) may be eliminated in a preprocessing step to obtain the relevant network (Lin and Druzdzel 1997). The relevant network consists of target variables, evidence variables, and variables on paths between target and evidence variables only.
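
Part of this preprocessing step is easy to sketch. The function below handles only barren variables (leaves that are neither target nor evidence variables, removed repeatedly until none remain); identifying nuisance variables requires a path analysis that is not attempted here. The function name `prune_barren` and the example graph are hypothetical.

```python
def prune_barren(parents, keep):
    """Repeatedly drop leaf variables that are neither targets nor evidence.

    parents: dict mapping each variable to a list of its parents.
    keep:    set of target and evidence variables that must survive.
    """
    remaining = {v: list(ps) for v, ps in parents.items()}
    while True:
        # A variable has a child if it appears in some parent list.
        has_child = {p for ps in remaining.values() for p in ps}
        barren = [v for v in remaining if v not in has_child and v not in keep]
        if not barren:
            return remaining
        for v in barren:
            del remaining[v]


# Hypothetical network: A -> B -> C -> D, plus B -> E.
network = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["B"]}

# Query: target A, evidence on C. D and E are barren and can be removed.
relevant = prune_barren(network, keep={"A", "C"})
```

Because the pruning depends only on which variables are targets or evidence, not on the observed values, it can be done once and reused across many instantiations of the evidence.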
 
p.122 Probabilistic inference is the task of updating our belief about the state of the world in light of evidence. Evidence on discrete variables, be it hard or soft evidence, is treated as in the case of discrete Bayesian networks.
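
In the simplest discrete case, updating belief in light of evidence is an application of Bayes' rule. The sketch below assumes a two-state hypothesis and a single observed test result; the names and numbers are invented for illustration and are not taken from the book.

```python
def posterior(prior, likelihood):
    """Bayes' rule: P(h | e) is proportional to P(h) * P(e | h)."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(unnormalized.values())   # P(e), the probability of the evidence
    return {h: v / z for h, v in unnormalized.items()}


# Hypothetical numbers: a rare condition and an imperfect test.
prior = {"disease": 0.01, "healthy": 0.99}
# P(test positive | hypothesis)
likelihood_positive = {"disease": 0.95, "healthy": 0.05}

belief = posterior(prior, likelihood_positive)
```

Even after a positive result the belief in "disease" rises only from 1% to roughly 16%, because the prior is so low. The normalizing constant `z` is the probability of the evidence itself.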
 
p.124 We build decision models in order to support efficient reasoning and decision making under uncertainty in a given problem domain. Reasoning under uncertainty is the task of computing our updated beliefs in (unobserved) events given observations on other events whereas decision making under uncertainty is the task of identifying the (optimal) decision strategy for the decision maker given observations.
 
p.137 We build decision models in order to support efficient reasoning and decision making under uncertainty in a given problem domain. Reasoning under uncertainty is the task of computing our updated beliefs in (unobserved) events given observations on other events [i.e., evidence] whereas decision making under uncertainty is the task of identifying the (optimal) decision strategy for the decision maker given observations.
 
p.144 There are many good reasons to choose probabilistic networks as the modeling framework, including the coherent and mathematically sound handling of uncertainty and normative decision making, the automated construction and adaptation of models based on data, the intuitive and compact representation of cause-effect relations and (conditional) dependence and independence relations, the efficient solution of queries given evidence, and the ability to support a whole range of analyses of the results produced, including conflict analysis, sensitivity analysis (with respect to both parameters and evidence), and value-of-information analysis.
 
p.145 There are four ground characteristics that constitute the foundation of (normative) probabilistic models:
 
Graphical representation of causal relations among domain entities (variables). The notion of causality is central in probabilistic networks, meaning that a directed link from one variable to another (usually) signifies a causal relation among the two...
 
Strengths of probabilistic relations are represented by (conditional) probabilities. Causal relations among variables are seldom deterministic in the sense that if the cause is present, then the effect can be concluded by certainty...
 
Preferences are represented as utilities on a numerical scale. All sorts of preferences that are relevant in a decision scenario must be expressed on a numerical scale...
 
Recommendations are based on the principle of maximal expected utility. As the reasoning performed by a probabilistic network is normative, the outcome (e.g., most likely diagnosis or suggested decision) is guaranteed to provide a recommended course of action that maximizes the expected utility to the extent that the model is a 'true' representation of [the] problem domain.
 
p.146-147 we might set up the following criteria to be met for probabilistic networks to potentially be a good candidate technology for solving the problem at hand:
 
Well defined variables. The variables and events (i.e., possible values of the variables) of the problem domain need to be well-defined...
 
Highly structured problem domain with identifiable cause-effect relations. Well-established and detailed knowledge should be available concerning structure (variables and (causal) links), conditional probabilities, and utilities (preferences). In general, the structure needs to be static (i.e., not changing over time), although re-estimation of structure (often the usage of learning tools; see chapter 8) can be performed...
 
Uncertainty associated with the cause-effect relations. If all cause-effect relations are deterministic (i.e., all conditional probabilities either take the value 0 or the value 1), more efficient technologies probably exist...
 
Repetitive problem solving. Often, for the (sometimes large) effort invested in constructing a probabilistic network to pay off, the problem solved should be of a repetitive nature. A physician diagnosing respiratory diseases, an Internet company profiling their customers, and a bank deciding to grant loans to its customers are all examples of problems that need to be solved over and over again, where the involved variables and causal mechanisms are invariant over time, and only the values observed for (some of) the variables differ...
 
Maximization of expected utility. For the probabilistic network framework to be a natural choice, the problem at hand should most probably contain an element of decision making involving a desire to maximize the expected utility of a decision.
 
p.148 Constraint variables (see chapter 7) also depend deterministically on [their] parent variables. Such "artificial" variables can be handy in many modeling situations, for example, reducing the number of conditional probabilities needed to be specified or enforcing constraints on the combinations of states among a subset of the variables.
 
p.149-150 Identifying the variables of a problem domain is not always an easy task, and requires some practicing... one needs to focus on the problem (possible diagnoses, classifications, predictions, decisions, etc. to be made) and the relevant pieces of information for solving the problem... In the process of identifying the variables it can be useful to distinguish between different types of variables:
 
Problem variables: These are the variables of interest; i.e., those for which we want to compute their posterior probability given observations of values for information variables (see next item). Usually, the values of problem variables cannot be observed; otherwise, there would not be any point in constructing a probabilistic network in the first place...
 
Information variables: These are the variables for which observations may be available, and which can provide information relevant for solving the problem. Two sub-categories of information variables can be distinguished: Background information... Symptom information...
 
Mediating variables: These are unobservable variables for which posterior probabilities are not of immediate interest, but which play important roles for achieving correct conditional independence and dependence properties and/or efficient inference.
 
p.152-153 Given an initial set of variables identified for a given problem domain, the next step in the model construction process concerns the identification and verification of (causal) links of the model.
 
p.170 When constructing a model (probabilistic or not) it is crucial to realize that real-world problem domains are usually embedded in a complex reality involving interaction with numerous different aspects of the real world in a way that can never be fully captured in a model. Also, the internal causal mechanisms of a problem domain can almost always only be approximately described in a model. Thus it is important to bear in mind that all models are wrong, but that some might be useful.
 
p.171 "In his writings, William of Occam (or Ockham) (1284-1347) stressed the Aristotelian principle that entities must not be multiplied beyond what is necessary. This principle became known as Occam's Razor or the law of parsimony; a problem should be stated in its basic and simplest terms. In science, the simplest theory that fits the facts of the problem is the one that should be selected. This rule is interpreted to mean that the simplest of two or more competing theories is preferable and that an explanation for unknown phenomena should first be attempted in terms of what is already known."
 
p.174 we pointed to the fact that the best models are usually constructed through deliberate use of the law of parsimony (or Occam's razor).
 
p.220 An influence diagram is useful for solving problems of decision making under uncertainty. The variables of an influence diagram consist of a mixture of random variables and decision variables. The random variables are used for representing uncertainty while the decision variables represent entities under the full control of the decision maker. The state of a random variable may be observable or hidden while the state of a decision variable is under the full control of the decision maker.
 
p.261 It is difficult or even impossible to construct models covering all aspects of (complex) problem domains of interest. A model is therefore most often an approximation of a problem domain that is designed to be applied according to the assumptions as determined by the background condition or context of the model. If a model is used under circumstances not consistent with the background condition, the results will in general be unreliable.
