The purpose of this appendix is to demonstrate that the function:
is a convex function of the variational parameters .
We note first that affine transformations do not change convexity
properties. Thus convexity in $X$ implies convexity in the variational
parameters, on which $X$ depends affinely. It remains to show that
$$f(\tilde{X}) = \log E\{e^{X}\} = \log \sum_i p_i e^{X_i}$$
is a convex function of the vector $\tilde{X} = (X_1, \ldots, X_n)^T$;
here we have indicated the discrete values in the range of the random
variable $X$ by $X_i$ and denoted the probability measure on such values
by $p_i$. Taking the gradient of $f$ with respect to $X_k$
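gives:
$$\frac{\partial f}{\partial X_k} = \frac{p_k e^{X_k}}{\sum_i p_i e^{X_i}} = q_k,$$
where $q_k$ defines a probability distribution. The convexity is
revealed by a positive semi-definite Hessian $H$, whose components in
this case are
$$H_{kl} = \frac{\partial^2 f}{\partial X_k \partial X_l} = \delta_{kl} q_k - q_k q_l.$$
To see that $H$ is positive semi-definite, consider, for an arbitrary
vector $Z = (Z_1, \ldots, Z_n)^T$,
$$Z^T H Z = \sum_k q_k Z_k^2 - \Big(\sum_k q_k Z_k\Big)^2 = \mathrm{Var}\{Z\} \geq 0,$$
where $\mathrm{Var}\{Z\}$ is the variance of a discrete random variable
$Z$ which takes the values $Z_k$ with probability $q_k$.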
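
The argument can be checked numerically. The Python sketch below assumes that f has the log-expectation form log sum_i p_i exp(X_i), whose gradient is a normalized probability vector q and whose Hessian has components delta_kl q_k - q_k q_l. It then compares the quadratic form z^T H z with the variance of a discrete variable taking the value z_k with probability q_k. The function names, the problem size, and the random test points are illustrative choices, not part of the original text.

```python
import math
import random


def log_expectation(p, x):
    """f(x) = log sum_i p_i * exp(x_i), with the max subtracted for stability."""
    m = max(x)
    return m + math.log(sum(pi * math.exp(xi - m) for pi, xi in zip(p, x)))


def tilted_distribution(p, x):
    """Gradient of f: q_k = p_k * exp(x_k) / sum_i p_i * exp(x_i)."""
    m = max(x)
    w = [pi * math.exp(xi - m) for pi, xi in zip(p, x)]
    total = sum(w)
    return [wi / total for wi in w]


def hessian(q):
    """Hessian of f (assumed form): H_kl = delta_kl * q_k - q_k * q_l."""
    n = len(q)
    return [[(q[k] if k == l else 0.0) - q[k] * q[l] for l in range(n)]
            for k in range(n)]


def quadratic_form(h, z):
    """z^T H z for a square matrix stored as nested lists."""
    n = len(z)
    return sum(z[k] * h[k][l] * z[l] for k in range(n) for l in range(n))


def variance(q, z):
    """Variance of a discrete variable taking value z_k with probability q_k."""
    mean = sum(qk * zk for qk, zk in zip(q, z))
    return sum(qk * zk * zk for qk, zk in zip(q, z)) - mean * mean


def check_convexity(n=5, trials=200, seed=0):
    """Return (largest |z^T H z - Var_q(Z)|, smallest z^T H z) over random trials."""
    rng = random.Random(seed)
    raw = [rng.random() + 0.1 for _ in range(n)]
    p = [r / sum(raw) for r in raw]  # arbitrary base probabilities p_i
    worst_gap, smallest_form = 0.0, float("inf")
    for _ in range(trials):
        x = [rng.uniform(-3, 3) for _ in range(n)]
        z = [rng.uniform(-3, 3) for _ in range(n)]
        q = tilted_distribution(p, x)
        form = quadratic_form(hessian(q), z)
        worst_gap = max(worst_gap, abs(form - variance(q, z)))
        smallest_form = min(smallest_form, form)
    return worst_gap, smallest_form


if __name__ == "__main__":
    gap, smallest = check_convexity()
    print(f"max |z^T H z - Var(Z)| = {gap:.2e}, min z^T H z = {smallest:.2e}")
```

Up to floating-point rounding, the gap should vanish and the quadratic form should never be negative, which is the positive semi-definiteness claim. A check of this kind only samples finitely many points, so it illustrates the proof rather than replacing it.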