Software Libraries
- Juice.jl: a Julia package for logical and probabilistic circuits with fast CUDA bindings
- SPFlow: an easy and extensible Python library for SPNs that compiles them to PyTorch, TensorFlow and C code (see the sketch after this list)
- LibSPN-Keras: a Keras-centered, TensorFlow 2.x compatible library for SPNs
- LibSPN: a TensorFlow implementation of SPNs with Python bindings
- Libra Toolkit: learning and inference routines for ACs (arithmetic circuits), implemented in OCaml
- BayesianSumProductNetworks.jl: a Julia implementation of Bayesian structure and parameter learning
- SumProductNetworks.jl: a Julia software package for SPNs
- Tachyon: structure and parameter learning in Python 3
- GoSPN: an implementation of LearnSPN in Go
- SPPL: a probabilistic programming language that translates generative code to generalized SPNs
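For orientation, here is what building and querying a small SPN looks like in SPFlow, the Python library above. This is a minimal sketch following SPFlow's documented operator interface; exact module paths and signatures may differ across versions.

```python
import numpy as np

# SPFlow overloads the arithmetic operators: `w * node` attaches a sum
# weight, `*` between nodes builds a product node over disjoint scopes,
# and `+` between weighted nodes builds a sum node.
from spn.structure.leaves.parametric.Parametric import Categorical
from spn.algorithms.Inference import log_likelihood

# A two-component mixture over three binary variables (scopes 0, 1, 2).
spn = 0.4 * (Categorical(p=[0.9, 0.1], scope=0) *
             Categorical(p=[0.8, 0.2], scope=1) *
             Categorical(p=[0.7, 0.3], scope=2)) \
    + 0.6 * (Categorical(p=[0.2, 0.8], scope=0) *
             Categorical(p=[0.3, 0.7], scope=1) *
             Categorical(p=[0.4, 0.6], scope=2))

# Exact log-likelihood of one sample; an np.nan entry marginalizes
# the corresponding variable out of the query.
data = np.array([[0.0, 1.0, np.nan]])
print(log_likelihood(spn, data))
```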
Papers with code
- SPPL: Probabilistic Programming With Fast Exact Symbolic Inference [ paper | code | bibtex | abstract ]
We present the Sum-Product Probabilistic Language (SPPL), a new probabilistic programming language that automatically delivers exact solutions to a broad range of probabilistic inference queries. SPPL translates probabilistic programs into sum-product expressions, a new symbolic representation and associated semantic domain that extends standard sum-product networks to support mixed-type distributions, numeric transformations, logical formulas, and pointwise and set-valued constraints. We formalize SPPL via a novel translation strategy from probabilistic programs to sum-product expressions and give sound exact algorithms for conditioning on and computing probabilities of events. SPPL imposes a collection of restrictions on probabilistic programs to ensure they can be translated into sum-product expressions, which allow the system to leverage new techniques for improving the scalability of translation and inference by automatically exploiting probabilistic structure. We implement a prototype of SPPL with a modular architecture and evaluate it on benchmarks the system targets, showing that it obtains up to 3500x speedups over state-of-the-art symbolic systems on tasks such as verifying the fairness of decision tree classifiers, smoothing hidden Markov models, conditioning transformed random variables, and computing rare event probabilities.
@inproceedings{saad2021sppl,
title = {{SPPL:} Probabilistic Programming with Fast Exact Symbolic Inference},
author = {Saad, Feras A. and Rinard, Martin C. and Mansinghka, Vikash K.},
booktitle = {Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation},
pages = {804--819},
numpages = {16},
year = {2021},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
doi = {10.1145/3453483.3454078},
isbn = {9781450383912},
keywords = {probabilistic programming, symbolic execution},
location = {Virtual, Canada},
}
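The central idea, independently of SPPL's concrete syntax, fits in a few lines: once a program is translated into a sum-product expression, event probabilities and conditioning reduce to closed-form computations at the leaves, recombined through the sum weights. The sketch below is not SPPL's API; it only illustrates that reduction, with scipy distributions standing in for leaves.

```python
from scipy.stats import norm

# A two-branch generative program such as
#   if bernoulli(0.3): X ~ Normal(0, 1) else: X ~ Normal(4, 2)
# translates to a sum node with two Gaussian leaves.
weights = [0.3, 0.7]
leaves = [norm(loc=0.0, scale=1.0), norm(loc=4.0, scale=2.0)]

# Exact probability of the event {X > 2}: the query is pushed down to
# the leaves (closed-form survival functions) and recombined upward.
p_event = sum(w * leaf.sf(2.0) for w, leaf in zip(weights, leaves))

# Exact posterior over the branch given {X > 2}, by Bayes' rule;
# no sampling or numeric integration is involved.
posterior = [w * leaf.sf(2.0) / p_event for w, leaf in zip(weights, leaves)]
print(p_event, posterior)
```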
- Sum-product graphical models [ paper | code | bibtex | abstract ]
This paper introduces a probabilistic architecture called sum–product graphical model (SPGM). SPGMs represent a class of probability distributions that combines, for the first time, the semantics of probabilistic graphical models (GMs) with the evaluation efficiency of sum–product networks (SPNs): Like SPNs, SPGMs always enable tractable inference using a class of models that incorporate context specific independence. Like GMs, SPGMs provide a high-level model interpretation in terms of conditional independence assumptions and corresponding factorizations. Thus, this approach provides new connections between the fields of SPNs and GMs, and enables a high-level interpretation of the family of distributions encoded by SPNs. We provide two applications of SPGMs in density estimation with empirical results close to or surpassing state-of-the-art models. The theoretical and practical results demonstrate that jointly exploiting properties of SPNs and GMs is an interesting direction of future research.
@article{desana2020sum-product,
author = {Desana, Mattia and Schnoerr, Christoph},
title = {Sum-product graphical models},
journal = {Mach. Learn.},
volume = {109},
number = {1},
pages = {135--173},
year = {2020}
}
- Visualizing and understanding Sum-Product Networks [ paper | code | bibtex | abstract ]
Sum-Product Networks (SPNs) are deep tractable probabilistic models by which several kinds of inference queries can be answered exactly and in a tractable time. They have been largely used as black box density estimators, assessed by comparing their likelihood scores on different tasks. In this paper we explore and exploit the inner representations learned by SPNs. By taking a closer look at the inner workings of SPNs, we aim to better understand what and how meaningful the representations they learn are, as in a classic Representation Learning framework. We firstly propose an interpretation of SPNs as Multi-Layer Perceptrons, we then devise several criteria to extract representations from SPNs and finally we empirically evaluate them in several (semi-)supervised tasks showing they are competitive against classical feature extractors like RBMs, DBNs and deep probabilistic autoencoders, like MADEs and VAEs.
@article{vergari2019visualizing,
author = {Vergari, Antonio and Di Mauro, Nicola and Esposito, Floriana},
title = {Visualizing and understanding Sum-Product Networks},
journal = {Mach. Learn.},
volume = {108},
number = {4},
pages = {551--573},
year = {2019}
}
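The extraction step the paper builds on is mechanical once an evaluator is available: run one bottom-up pass and keep the inner node activations as the sample's embedding. A minimal sketch under a toy node encoding of our own (not the paper's code):

```python
import numpy as np

def forward(node, x, cache):
    """Bottom-up evaluation that records every node's activation."""
    if node["type"] == "leaf":
        val = node["dist"][x[node["var"]]]
    elif node["type"] == "product":
        val = np.prod([forward(c, x, cache) for c in node["children"]])
    else:  # sum node
        val = np.dot(node["weights"],
                     [forward(c, x, cache) for c in node["children"]])
    cache.append(val)
    return val

def embedding(spn, x):
    """The sample's representation: log-activations gathered in a single
    pass. The paper also studies filtering nodes by type or scope size."""
    cache = []
    forward(spn, x, cache)
    return np.log(np.array(cache) + 1e-30)
```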
- Maximum A Posteriori Inference in Sum-Product Networks [ paper | code | bibtex | abstract ]
Sum-product networks (SPNs) are a class of probabilistic graphical models that allow tractable marginal inference. However, the maximum a posteriori (MAP) inference in SPNs is NP-hard. We investigate MAP inference in SPNs from both theoretical and algorithmic perspectives. For the theoretical part, we reduce general MAP inference to its special case without evidence and hidden variables; we also show that it is NP-hard to approximate the MAP problem to 2^(n^ϵ) for fixed 0 ≤ ϵ < 1, where n is the input size. For the algorithmic part, we first present an exact MAP solver that runs reasonably fast and could handle SPNs with up to 1k variables and 150k arcs in our experiments. We then present a new approximate MAP solver with a good balance between speed and accuracy, and our comprehensive experiments on real-world datasets show that it has better overall performance than existing approximate solvers.
@inproceedings{mei2018maximum,
author = {Mei, Jun and Jiang, Yong and Tu, Kewei},
title = {Maximum {A} Posteriori Inference in Sum-Product Networks},
booktitle = {{AAAI}},
pages = {1923--1930},
publisher = {{AAAI} Press},
year = {2018}
}
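As a baseline for what the paper improves on, the classic Viterbi-style (max-product) decoder takes a handful of lines: replace weighted sums by weighted maxima on the way up, then follow argmax children on the way down. The dict-based node encoding is ours, for illustration only.

```python
import numpy as np

def max_value(node):
    """Upward max-product pass: sum nodes take a weighted max instead of
    a weighted sum (memoization omitted for brevity)."""
    if node["type"] == "leaf":
        return max(node["dist"])
    if node["type"] == "product":
        return np.prod([max_value(c) for c in node["children"]])
    return max(w * max_value(c)
               for w, c in zip(node["weights"], node["children"]))

def map_assignment(node, out):
    """Downward decoding: follow the best child at sum nodes, recurse into
    all children at product nodes, commit the mode at leaves. Exact on
    selective SPNs but only approximate in general, since MAP is NP-hard."""
    if node["type"] == "leaf":
        out[node["var"]] = int(np.argmax(node["dist"]))
    elif node["type"] == "product":
        for c in node["children"]:
            map_assignment(c, out)
    else:
        best = max(zip(node["weights"], node["children"]),
                   key=lambda wc: wc[0] * max_value(wc[1]))[1]
        map_assignment(best, out)
```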
- Sum-Product Autoencoding: Encoding and Decoding Representations Using Sum-Product Networks [ paper | code | bibtex | abstract ]
Sum-Product Networks (SPNs) are a deep probabilistic architecture that up to now has been successfully employed for tractable inference. Here, we extend their scope towards unsupervised representation learning: we encode samples into continuous and categorical embeddings and show that they can also be decoded back into the original input space by leveraging MPE inference. We characterize when this Sum-Product Autoencoding (SPAE) leads to equivalent reconstructions and extend it towards dealing with missing embedding information. Our experimental results on several multi-label classification problems demonstrate that SPAE is competitive with state-of-the-art autoencoder architectures, even if the SPNs were never trained to reconstruct their inputs.
@inproceedings{vergari2018sum-product,
author = {Vergari, Antonio and Peharz, Robert and Di Mauro, Nicola and Molina, Alejandro and Kersting, Kristian and Esposito, Floriana},
title = {Sum-Product Autoencoding: Encoding and Decoding Representations Using Sum-Product Networks},
booktitle = {{AAAI}},
pages = {4163--4170},
publisher = {{AAAI} Press},
year = {2018}
}
- Mixed Sum-Product Networks: A Deep Architecture for Hybrid Domains [ paper | code | bibtex | abstract ]
While all kinds of mixed data---from personal data, over panel and scientific data, to public and commercial data---are collected and stored, building probabilistic graphical models for these hybrid domains becomes more difficult. Users spend significant amounts of time in identifying the parametric form of the random variables (Gaussian, Poisson, Logit, etc.) involved and learning the mixed models. To make this difficult task easier, we propose the first trainable probabilistic deep architecture for hybrid domains that features tractable queries. It is based on Sum-Product Networks (SPNs) with piecewise polynomial leaf distributions together with novel nonparametric decomposition and conditioning steps using the Hirschfeld-Gebelein-Renyi Maximum Correlation Coefficient. This relieves the user from deciding a-priori the parametric form of the random variables but is still expressive enough to effectively approximate any distribution and permits efficient learning and inference. Our experiments show that the architecture, called Mixed SPNs, can indeed capture complex distributions across a wide range of hybrid domains.
@inproceedings{molina2018mixed,
author = {Molina, Alejandro and Vergari, Antonio and Di Mauro, Nicola and Natarajan, Sriraam and Esposito, Floriana and Kersting, Kristian},
title = {Mixed Sum-Product Networks: {A} Deep Architecture for Hybrid Domains},
booktitle = {{AAAI}},
pages = {3828--3835},
publisher = {{AAAI} Press},
year = {2018}
}
- Fast and Accurate Density Estimation with Extremely Randomized Cutset Networks [ paper | code | bibtex | abstract ]
Cutset Networks (CNets) are density estimators leveraging context-specific independencies recently introduced to provide exact inference in polynomial time. Learning a CNet is done by firstly building a weighted probabilistic OR tree and then estimating tractable distributions as its leaves. Specifically, selecting an optimal OR split node requires cubic time in the number of the data features, and even approximate heuristics still scale in quadratic time. We introduce Extremely Randomized Cutset Networks (XCNets), CNets whose OR tree is learned by performing random conditioning. This simple yet surprisingly effective approach reduces the complexity of OR node selection to constant time. While the likelihood of an XCNet is slightly worse than an optimally learned CNet, ensembles of XCNets outperform state-of-the-art density estimators on a series of standard benchmark datasets, yet employing only a fraction of the time needed to learn the competitors. Code and data related to this chapter are available at: https://github.com/nicoladimauro/cnet.
@inproceedings{dimauro2017fast,
author = {Di Mauro, Nicola and Vergari, Antonio and Basile, Teresa Maria Altomare and Esposito, Floriana},
title = {Fast and Accurate Density Estimation with Extremely Randomized Cutset Networks},
booktitle = {{ECML/PKDD} {(1)}},
series = {Lecture Notes in Computer Science},
volume = {10534},
pages = {203--219},
publisher = {Springer},
year = {2017}
}
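The trick that makes XCNets fast fits in one line: the OR-split variable is drawn uniformly at random (constant time) instead of being scored (cubic time in the original CNet learner). A sketch of the recursion, where `fit_tree_leaf` is a hypothetical placeholder for fitting the tractable leaf distribution:

```python
import numpy as np

def learn_xcnet_or_tree(data, scope, rng, min_instances=50):
    """XCNet OR-tree recursion (sketch). `fit_tree_leaf` is a placeholder
    for the tractable leaf model, e.g. a Chow-Liu tree over `scope`."""
    if len(scope) == 1 or len(data) < min_instances:
        return fit_tree_leaf(data, scope)
    var = rng.choice(scope)                      # O(1) random conditioning
    mask = data[:, var] == 1
    weight = (mask.sum() + 1) / (len(data) + 2)  # smoothed branch weight
    rest = [v for v in scope if v != var]
    return {"type": "or", "var": var, "weight": weight,
            "hi": learn_xcnet_or_tree(data[mask], rest, rng, min_instances),
            "lo": learn_xcnet_or_tree(data[~mask], rest, rng, min_instances)}
```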
- Alternative Variable Splitting Methods to Learn Sum-Product Networks [ paper | code | bibtex | abstract ]
Sum-Product Networks (SPNs) are recent deep probabilistic models providing exact and tractable inference. SPNs have been successfully employed as density estimators in several application domains. However, learning an SPN from high dimensional data still poses a challenge in terms of time complexity. This is due to the high cost of determining independencies among random variables (RVs) and sub-populations among samples, two operations that are repeated several times. Even one of the simplest greedy structure learners, LearnSPN, scales quadratically in the number of variables to determine RV independencies. In this work we investigate approximate but fast procedures to determine independencies among RVs whose complexity scales in sub-quadratic time. We propose two procedures: a random subspace approach and one that adopts entropy as a criterion to split RVs in linear time. Experimental results prove that LearnSPN equipped with our splitting procedures is able to reduce learning and/or inference times while preserving comparable inference accuracy.
@inproceedings{dimauro2017alternative,
author = {Di Mauro, Nicola and Esposito, Floriana and Ventola, Fabrizio G. and Vergari, Antonio},
title = {Alternative Variable Splitting Methods to Learn Sum-Product Networks},
booktitle = {AI*IA},
series = {Lecture Notes in Computer Science},
volume = {10640},
pages = {334--346},
publisher = {Springer},
year = {2017}
}
- Online Structure Learning for Sum-Product Networks with Gaussian Leaves [ paper | code | bibtex | abstract ]
Sum-product networks have recently emerged as an attractive representation due to their dual view as a special type of deep neural network with clear semantics and a special type of probabilistic graphical model for which inference is always tractable. Those properties follow from some conditions (i.e., completeness and decomposability) that must be respected by the structure of the network. As a result, it is not easy to specify a valid sum-product network by hand and therefore structure learning techniques are typically used in practice. This paper describes the first online structure learning technique for continuous SPNs with Gaussian leaves. We also introduce an accompanying new parameter learning technique.
@inproceedings{hsu2017online,
author = {Hsu, Wilson and Kalra, Agastya and Poupart, Pascal},
title = {Online Structure Learning for Sum-Product Networks with Gaussian Leaves},
booktitle = {{ICLR} (Workshop)},
publisher = {OpenReview.net},
year = {2017}
}
- On the Latent Variable Interpretation in Sum-Product Networks [ paper | code | bibtex | abstract ]
One of the central themes in Sum-Product networks (SPNs) is the interpretation of sum nodes as marginalized latent variables (LVs). This interpretation yields an increased syntactic or semantic structure, allows the application of the EM algorithm, and enables efficient MPE inference. In the literature, the LV interpretation was justified by explicitly introducing the indicator variables corresponding to the LVs' states. However, as pointed out in this paper, this approach is in conflict with the completeness condition in SPNs and does not fully specify the probabilistic model. We propose a remedy for this problem by modifying the original approach for introducing the LVs, which we call SPN augmentation. We discuss conditional independencies in augmented SPNs, formally establish the probabilistic interpretation of the sum-weights and give an interpretation of augmented SPNs as Bayesian networks. Based on these results, we find a sound derivation of the EM algorithm for SPNs. Furthermore, the Viterbi-style algorithm for MPE proposed in the literature was never proven to be correct. We show that this is indeed a correct algorithm, when applied to selective SPNs, and in particular when applied to augmented SPNs. Our theoretical results are confirmed in experiments on synthetic data and 103 real-world datasets.
@article{peharz2017latent,
author = {Peharz, Robert and Gens, Robert and Pernkopf, Franz and Domingos, Pedro M.},
title = {On the Latent Variable Interpretation in Sum-Product Networks},
journal = {{IEEE} Trans. Pattern Anal. Mach. Intell.},
volume = {39},
number = {10},
pages = {2030--2044},
year = {2017}
}
- A Unified Approach for Learning the Parameters of Sum-Product Networks [ paper | code | bibtex | abstract ]
We present a unified approach for learning the parameters of Sum-Product networks (SPNs). We prove that any complete and decomposable SPN is equivalent to a mixture of trees where each tree corresponds to a product of univariate distributions. Based on the mixture model perspective, we characterize the objective function when learning SPNs based on the maximum likelihood estimation (MLE) principle and show that the optimization problem can be formulated as a signomial program. We construct two parameter learning algorithms for SPNs by using sequential monomial approximations (SMA) and the concave-convex procedure (CCCP), respectively. The two proposed methods naturally admit multiplicative updates, hence effectively avoiding the projection operation. With the help of the unified framework, we also show that, in the case of SPNs, CCCP leads to the same algorithm as Expectation Maximization (EM) despite the fact that they are different in general.
@inproceedings{zhao2016unified,
author = {Zhao, Han and Poupart, Pascal and Gordon, Geoffrey J.},
title = {A Unified Approach for Learning the Parameters of Sum-Product Networks},
booktitle = {{NIPS}},
pages = {433--441},
year = {2016}
}
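The multiplicative update the paper derives is easy to state per sum node, given the quantities computed in one upward pass (node values) and one downward pass (root derivatives). A sketch with our own variable names, not the paper's; all arguments are numpy arrays with shapes as documented:

```python
import numpy as np

def cccp_weight_update(weights, child_values, node_derivative, root_values):
    """One CCCP/EM-style update of a single sum node's weights, over a
    batch of N samples:
      weights[k]           current weight of child k
      child_values[k, n]   value of child k on sample n (upward pass)
      node_derivative[n]   dS(x_n)/dS_i for this sum node (downward pass)
      root_values[n]       root value S(x_n)
    The update is multiplicative, so the weights stay nonnegative and are
    renormalized onto the simplex without any projection step."""
    resp = (weights[:, None] * child_values
            * node_derivative[None, :] / root_values[None, :])
    new_weights = resp.sum(axis=1)
    return new_weights / new_weights.sum()
```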
- Simplifying, Regularizing and Strengthening Sum-Product Network Structure Learning [ paper | supplemental | code | bibtex | abstract ]
The need for feasible inference in Probabilistic Graphical Models (PGMs) has led to tractable models like Sum-Product Networks (SPNs). Their high expressive power and their ability to provide exact and tractable inference make them very attractive for several real world applications, from computer vision to NLP. Recently, great attention around SPNs has focused on structure learning, leading to different algorithms being able to learn both the network and its parameters from data. Here, we enhance one of the best structure learners, LearnSPN, aiming to improve both the structural quality of the learned networks and their achieved likelihoods. Our algorithmic variations are able to learn simpler, deeper and more robust networks. These results have been obtained by exploiting some insights in the building process done by LearnSPN, by hybridizing the network adopting tree-structured models as leaves, and by blending bagging estimations into mixture creation. We prove our claims by empirically evaluating the learned SPNs on several benchmark datasets against other competitive SPN and PGM structure learners.
@inproceedings{vergari2015simplifying,
title={Simplifying, regularizing and strengthening sum-product network structure learning},
author={Vergari, Antonio and Di Mauro, Nicola and Esposito, Floriana},
booktitle={Joint European Conference on Machine Learning and Knowledge Discovery in Databases},
pages={343--358},
year={2015},
organization={Springer}
}
- Learning Accurate Cutset Networks by Exploiting Decomposability [ paper | code | bibtex | abstract ]
The rising interest around tractable Probabilistic Graphical Models is due to the guarantees on inference feasibility they provide. Among them, Cutset Networks (CNets) have recently been introduced as models embedding Pearl’s cutset conditioning algorithm in the form of weighted probabilistic model trees with tree-structured models as leaves. Learning the structure of CNets has been tackled as a greedy search leveraging heuristics from decision tree learning. Even if efficient, the learned models are far from being accurate in terms of likelihood. Here, we exploit the decomposable score of CNets to learn their structure and parameters by directly maximizing the likelihood, including the BIC criterion and informative priors on smoothing parameters. In addition, we show how to create mixtures of CNets by adopting a well known bagging method from the discriminative framework as an effective and cheap alternative to the classical EM. We compare our algorithms against the original variants on a set of standard benchmarks for graphical model structure learning, empirically proving our claims.
@inproceedings{di2015learning,
title={Learning accurate cutset networks by exploiting decomposability},
author={Di Mauro, Nicola and Vergari, Antonio and Esposito, Floriana},
booktitle={Congress of the Italian Association for Artificial Intelligence},
pages={221--232},
year={2015},
organization={Springer}
}
- Language Modeling with Sum-Product Networks [ paper | code | bibtex | abstract ]
Sum product networks (SPNs) are a new class of deep probabilistic models. They can contain multiple hidden layers while keeping their inference and training times tractable. An SPN consists of interleaving layers of sum nodes and product nodes. A sum node can be interpreted as a hidden variable, and a product node can be viewed as a feature capturing rich interactions among an SPN's inputs. We show that the ability of SPN to use hidden layers to model complex dependencies among words, and its tractable inference and learning times, make it a suitable framework for a language model. Even though SPNs have been applied to a variety of vision problems, we are the first to use it for language modeling. Our empirical comparisons with six previous language models indicate that our SPN has superior performance.
@inproceedings{cheng2014language,
title={Language modeling with sum-product networks},
author={Cheng, Wei-Chen and Kok, Stanley and Pham, Hoai Vu and Chieu, Hai Leong and Chai, Kian Ming A.},
booktitle={Fifteenth Annual Conference of the International Speech Communication Association},
year={2014}
}
- Modeling Speech with Sum-Product Networks: Application to Bandwidth Extension [ paper | code | bibtex | abstract ]
Sum-product networks (SPNs) are a recently proposed type of probabilistic graphical models allowing complex variable interactions while still granting efficient inference. In this paper we demonstrate the suitability of SPNs for modeling log-spectra of speech signals using the application of artificial bandwidth extension, i.e. artificially replacing the high-frequency content which is lost in telephone signals. We use SPNs as observation models in hidden Markov models (HMMs), which model the temporal evolution of log short-time spectra. Missing frequency bins are replaced by the SPNs using most-probable-explanation inference, where the state-dependent reconstructions are weighted with the HMM state posterior.
@inproceedings{peharz2014modeling,
title={Modeling speech with sum-product networks: Application to bandwidth extension},
author={Peharz, Robert and Kapeller, Georg and Mowlaee, Pejman and Pernkopf, Franz},
booktitle={2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
pages={3699--3703},
year={2014},
organization={IEEE}
}
- Learning Sum-Product Networks with Direct and Indirect Interactions [ paper | code | bibtex | abstract ]
Sum-product networks (SPNs) are a deep probabilistic representation that allows for efficient, exact inference. SPNs generalize many other tractable models, including thin junction trees, latent tree models, and many types of mixtures. Previous work on learning SPN structure has mainly focused on using top-down or bottom-up clustering to find mixtures, which capture variable interactions indirectly through implicit latent variables. In contrast, most work on learning graphical models, thin junction trees, and arithmetic circuits has focused on finding direct interactions among variables. In this paper, we present ID-SPN, a new algorithm for learning SPN structure that unifies the two approaches. In experiments on 20 benchmark datasets, we find that the combination of direct and indirect interactions leads to significantly better accuracy than several state-of-the-art algorithms for learning SPNs and other tractable models.
@inproceedings{rooshenas2014learning,
title={Learning sum-product networks with direct and indirect interactions},
author={Rooshenas, Amirmohammad and Lowd, Daniel},
booktitle={Proceedings of the Thirty-First International Conference on Machine Learning},
year={2014}
}
- Learning the Structure of Sum-Product Networks [ paper | code | bibtex | abstract ]
Sum-product networks (SPNs) are a new class of deep probabilistic models. SPNs can have unbounded treewidth but inference in them is always tractable. An SPN is either a univariate distribution, a product of SPNs over disjoint variables, or a weighted sum of SPNs over the same variables. We propose the first algorithm for learning the structure of SPNs that takes full advantage of their expressiveness. At each step, the algorithm attempts to divide the current variables into approximately independent subsets. If successful, it returns the product of recursive calls on the subsets; otherwise it returns the sum of recursive calls on subsets of similar instances from the current training set. A comprehensive empirical study shows that the learned SPNs are typically comparable to graphical models in likelihood but superior in inference speed and accuracy.
@inproceedings{gens2013learning,
title={Learning the structure of sum-product networks},
author={Gens, Robert and Domingos, Pedro},
booktitle={International conference on machine learning},
pages={873--880},
year={2013}
}
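LearnSPN's recursion is compact enough to sketch end to end. Below, the paper's statistical independence test is replaced by a crude correlation threshold and instance clustering uses k-means, so read this as the shape of the algorithm rather than a faithful reimplementation; recursion guards (minimum instance counts, fallback on failed splits) are omitted.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def fit_leaf(column, alpha=1.0):
    """Smoothed Bernoulli leaf for one binary variable."""
    p = (column.sum() + alpha) / (len(column) + 2 * alpha)
    return {"type": "leaf", "dist": [1 - p, p]}

def independent_groups(data, scope, thresh=0.1):
    """Crude stand-in for the paper's independence test: link variables
    whose absolute correlation exceeds thresh, then return the connected
    components of the resulting dependency graph."""
    adj = np.abs(np.corrcoef(data[:, scope].astype(float), rowvar=False)) > thresh
    groups, seen = [], set()
    for i in range(len(scope)):
        if i in seen:
            continue
        component, stack = [], [i]
        while stack:
            j = stack.pop()
            if j not in seen:
                seen.add(j)
                component.append(scope[j])
                stack.extend(k for k in range(len(scope)) if adj[j, k])
        groups.append(component)
    return groups

def learn_spn(data, scope):
    """Product node if the variables split into independent groups,
    else sum node over instance clusters, else a univariate leaf."""
    if len(scope) == 1:
        return fit_leaf(data[:, scope[0]])
    groups = independent_groups(data, scope)
    if len(groups) > 1:
        return {"type": "product",
                "children": [learn_spn(data, g) for g in groups]}
    _, labels = kmeans2(data[:, scope].astype(float), 2, minit="++")
    children = [learn_spn(data[labels == c], scope) for c in np.unique(labels)]
    weights = [np.mean(labels == c) for c in np.unique(labels)]
    return {"type": "sum", "weights": weights, "children": children}
```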
- Sum-Product Networks: A New Deep Architecture [ paper | code | bibtex | abstract ]
The key limiting factor in graphical model inference and learning is the complexity of the partition function. We thus ask the question: what are general conditions under which the partition function is tractable? The answer leads to a new kind of deep architecture, which we call sum-product networks (SPNs). SPNs are directed acyclic graphs with variables as leaves, sums and products as internal nodes, and weighted edges. We show that if an SPN is complete and consistent it represents the partition function and all marginals of some graphical model, and give semantics to its nodes. Essentially all tractable graphical models can be cast as SPNs, but SPNs are also strictly more general. We then propose learning algorithms for SPNs, based on backpropagation and EM. Experiments show that inference and learning with SPNs can be both faster and more accurate than with standard deep networks. For example, SPNs perform image completion better than state-of-the-art deep networks for this task. SPNs also have intriguing potential connections to the architecture of the cortex.
@inproceedings{poon2011sum,
title={Sum-product networks: A new deep architecture},
author={Poon, Hoifung and Domingos, Pedro},
booktitle={2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops)},
pages={689--690},
year={2011},
organization={IEEE}
}
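The definition maps directly onto a short evaluator. The sketch below uses our own toy encoding with indicator leaves, not the paper's code: one bottom-up pass computes the network value, and leaving a variable out of the evidence sets all of its indicators to 1, i.e. marginalizes it, which is exactly why every marginal costs a single linear-time pass.

```python
import numpy as np

def evaluate(node, evidence):
    """One bottom-up pass over an SPN with indicator leaves. A variable
    missing from `evidence` has all its indicators set to 1, so it is
    marginalized out of the query."""
    if node["type"] == "indicator":
        v = evidence.get(node["var"])
        return 1.0 if v is None or v == node["value"] else 0.0
    vals = [evaluate(c, evidence) for c in node["children"]]
    if node["type"] == "product":
        return float(np.prod(vals))
    return float(np.dot(node["weights"], vals))  # sum node

# A tiny complete and consistent SPN over two binary variables.
ind = lambda var, val: {"type": "indicator", "var": var, "value": val}
spn = {"type": "sum", "weights": [0.7, 0.3], "children": [
    {"type": "product", "children": [ind("x1", 1), ind("x2", 1)]},
    {"type": "product", "children": [ind("x1", 0), ind("x2", 1)]},
]}
print(evaluate(spn, {"x1": 1, "x2": 1}))  # joint P(x1=1, x2=1) = 0.7
print(evaluate(spn, {"x2": 1}))           # marginal P(x2=1) = 1.0
```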