![]() | ![]() | ![]() | Belief Network Inference |
Suppose we observe variables E1,...,Es have corresponding values o1...os. We want to determine the posterior probability of variable X, the query variable, given evidence E1=o1 &...&Es=os:
P(X|E1=o1 &...&Es=os)= (P(X &E1=o1 &...&Es=os))/(P(E1=o1 &...&Es=os))The denominator, P(E1=o1 &...&Es=os), is a normalizing factor:
P(E1=o1 &...&Es=os) = SUMv in dom(X) P(X=v &E1=o1 &...&Es=os)The problem of probabilistic inference can thus be reduced to the problem of computing the probability of conjunctions.
Let Y = {Y1,...,Yk} be the non-query, non-observed variables (i.e., Y = {X1,...,Xn}-{X}-{E1,...,Es}). To compute the marginal distribution, we sum out the Yi's:
where the subscripted probabilities mean that the associated variables are assigned the corresponding values.
P(X&E1=o1 &...&Es=os) = SUMYk···SUMY1 P(X1,...,Xn){E1=o1 &...&Es=os} = SUMYk···SUMY1 PRODi = 1n P(Xi|piXi){E1=o1 &...&Es=os}
Thus probabilistic inference reduces to the problem of summing out variables from a product of functions. To solve this efficiently we use the distribution law that we learned in high school: to compute a sum of products such as xy+xz efficiently, we distribute out the common factors (which here is x) which results in x(y+z). This is the essence of the VE algorithm. We call the elements multiplied together "factors" because of the use of the term in mathematics. Initially the factors represent the conditional probability tables, but the intermediate factors are just functions on variables that are created by adding and multiplying factors.
A factor on variables V1,...,Vd is a representation of a function from dom(V1) ×...×dom(Vd) into the real numbers.
Suppose that the Yi's are ordered according to some elimination ordering. We sum out the variables one at at time.
To sum out a variable Yi from a product, we distribute all of the factors that don't involve Yi out of the sum. Suppose f1,...,fk are some functions of the variables that are multiplied together (initially these are the conditional probabilities), then
SUMYi f1...fk = f1...fm SUMYi fm+1...fkwhere f1...fm are those functions that don't involve Yi, and fm+1...fk are those that do involve Yi. We explicitly construct a representation for the new function SUMYi fm+1...fk, and continue summing out the remaining variables. After all the Yi's have been summed out, the result is a function on X that is proportional to X's posterior distribution.
To compute P(X|E1=o1 &...&Es=os) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Let F be the factors obtained from the original conditional probabilities. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
1. | Replace each f in F that involves some Ei with f{E1=o1 ,..., Es=os}. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
2. | While there is a factor involving a non-query variable | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Select non-query variable Y to eliminate | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Set F= eliminate(Y,F). | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
3. | Return renormalize(F) |
Procedure eliminate(Y,F): | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Partition F into | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
{f1,..., fm} that don't contain Y and | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
{fm+1,..., fr} that do contain Y | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Compute f = SUMY fm+1×t...×tfr | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Return {f1,...,fm,f} |
Procedure renormalize({f1,...,fr}): | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Compute f = f1×t...×tfr | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Compute c = SUMX f | % c is normalizing constant | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Return f/c | % divide each element of f by c |
In the tabular implementation of the VE algorithm (Figure *), a function of d discrete variables V1,...,Vd, is represented as a d-dimensional table (which can be implemented, for example, as a d-dimensional array, as a tree of depth d, or, as in our implementation, as a 1-dimensional array based on a lexicographic ordering on the variables). If f is such a table, let variables(f) = {V1,...,Vd}. We sometimes write f as f[V1,...,Vd] to make the variables explicit. f is said to involve Vi if Vi in variables(f).
There are three primitive operations on tables: setting variables, forming the product of tables, and summing a variable from a table.
Definition.
Suppose C is a set of variables, c is an assignment
C = v, and f is a factor on variables
X. Let Y = X-C, let Z =
X intersect C, and let Z = v' be the
assignment of values to Z that assigns the same values to
elements of Z as c does. Define set(f,c) be the factor on
Y given by:
set(f,c)(Y) = f(Y,Z=v').
That is, set(f,c) is a function of Y, the variables of f that are not
in c, that is like f, but with some values already assigned. Note
that, as a special case of this, if c doesn't involve any variable
in f then set(f,c)=f.
Example.
Consider the factor f(A,B,C,D,E) defined by the table of Figure
*. Some examples of the value of this function are
f(a,b,c,d,e) = 0.55, and f(
~
a,b,c,d,~
e) = 1-0.08 = 0.92.
set(f,~
a&b &e) is a function of C and
D defined by the table:
C D value
c d 0.08
c ~
d 0.08
~
c d 0.025
~
c ~
d 0.5
Definition.
The product of tables f1 and f2, written
f1×tf2 is a table on the union of the variables in f1 and
f2 (i.e., variables(f1×tf2) = variables(f1) union variables(f2)) defined by:
(f1×tf2)(X,Y,Z) =
f1(X,Y)f2(Y,Z)
where Y is variables(f1) intersect variables(f2),
X is variables(f1)- variables(f2), and
Z is variables(f2)- variables(f1).
To construct the product of tables, fm+1×t···×tfk, we union all of the variables in fm+1...fk, say these are X1,...,Xr. Then we construct an r-dimensional table so there is an entry in the table for each combination v1,...,vr where vi in dom(Xi). The value for the entry corresponding to v1,...,vr is obtained by multiplying the values obtained from each fi applied to the projection of v1,...,vr onto the variables of fi.
Definition.
The summing out of variable Y from table f, written SUMY
f is the table with
variables Z = variables(f)-{Y} such that5
(SUMY
f)(Z) = SUMvi in dom(Y) f(Z&Y=vi)
where dom(Y) = {v1,...,vs}.
Example.
Consider eliminating B from the factors of Example
* (representing the belief network of Figure
*), where all of the variables are Boolean. The factors
that contain B, namely those factors that represent P(E|ABCD) and
P(B|YZ), are removed from the set of factors. We construct a factor
f1(A,B,C,D,E,Y,Z)=P(E|A,B,C,D) ×tP(B|Y,Z), thus, for example,
and similarly for the other values of A...Z. We then need
to sum out B from f1, producing f2(A,C,D,E,Y,Z) where, for
example,
f1(a,b,c,d,e,y,z) = P(e|a&b&c&d) P(b|y&z)
f1( ~
a,b,c,d,e,y,z) = P(e|~
a&b&c&d) P(b|y&z)
f1( ~
a,~
b,c,d,e,y,z) = P(e|~
a&~
b&c&d) P(~
b|y&z)
f1(a, ~
b,c,d,e,y,z) = P(e|a&~
b&c&d) P(~
b|y&z)
f2(a,c,d,e,y,z) = f1(a,b,c,d,e,y,z)+f1(a,
f2 is then added to the set of factors.
Note that the construction of f1 is for exposition only; we don't
necessarily have to construct a table for it explicitly.
~
b,c,d,e,y,z).
![]() | ![]() | ![]() | Belief Network Inference |