PhD Thesis Nico Piatkowski

Exponential families on resource-constrained systems

This work is about the estimation of exponential family models on resource-constrained systems. Our main goal is learning probabilistic models on devices with highly restricted storage, arithmetic, and computational capabilities—so called, ultra-low-power devices. Enhancing the learning capabilities of such devices opens up opportunities for intelligent ubiquitous systems in all areas of life, from medicine, over robotics, to home automation—to mention just a few. We investigate the inherent resource consumption of exponential families, review existing techniques, and devise new methods to reduce the resource consumption. The resource consumption, however, must not be reduced at all cost. Exponential families possess several desirable properties that must be preserved: Any probabilistic model encodes a conditional independence structure—our methods keep this structure intact. Exponential family models are theoretically well-founded. Instead of merely finding new algorithms based on intuition, our models are formalized within the framework of exponential families and derived from first principles. We do not introduce new assumptions which are incompatible with the formal derivation of the base model, and our methods do not rely on properties of particular high-level applications. To reduce the memory consumption, we combine and adapt reparametrization and regularization in an innovative way that facilitates the sparse parametrization of high-dimensional non-stationary time-series. The procedure allows us to load models in memory constrained systems, which would otherwise not fit. We provide new theoretical insights and prove that the uniform distance between the data generating process and our reparametrized solution is bounded. To reduce the arithmetic complexity of the learning problem, we derive the integer exponential family, based on the very definition of sufficient statistics and maximum entropy estimation. New integer-valued inference and learning algorithms are proposed, based on variational inference, proximal optimization, and regularization. The benefit of this technique is larger, the weaker the underlying system is, e.g., the probabilistic inference on a state-of-the-art ultra-lowpower microcontroller can be accelerated by a factor of 250. While our integer inference is fast, the underlying message passing relies on the variational principle, which is inexact and has unbounded error on general graphs. Since exact inference and other existing methods with bounded error exhibit exponential computational complexity, we employ near minimax optimal polynomial approximations to yield new stochastic algorithms for approximating the partition function and the marginal probabilities. Changing the polynomial degree allows us to control the complexity and the error of our new stochastic method. We provide an error bound that is parametrized by the number of samples, the polynomial degree, and the norm of the model’s parameter vector. Moreover, important intermediate quantities can be precomputed and shared with the weak computational device to reduce the resource requirement of our method even further. All new techniques are empirically evaluated on synthetic and real-world data, and the results confirm the properties which are predicted by our theoretical derivation. Our novel techniques allow a broader range of models to be learned on resource-constrained systems and imply several new research possibilities.