TY  - JOUR
AB  - Given a fixed finite metric space (V,μ), the minimum 0-extension problem, denoted as 0-Ext[μ], is equivalent to the following optimization problem: min_{x∈V^n} ∑_i f_i(x_i) + ∑_{ij} c_{ij} μ(x_i,x_j), where c_{ij}, c_{vi} are given nonnegative costs and f_i: V→R are functions given by f_i(x_i) = ∑_{v∈V} c_{vi} μ(x_i,v). The computational complexity of 0-Ext[μ] has been recently established by Karzanov and by Hirai: if metric μ is orientable modular then 0-Ext[μ] can be solved in polynomial time, otherwise 0-Ext[μ] is NP-hard. To prove the tractability part, Hirai developed a theory of discrete convex functions on orientable modular graphs generalizing several known classes of functions in discrete convex analysis, such as L♮-convex functions. We consider a more general version of the problem in which unary functions f_i(x_i) can additionally have terms of the form c_{uv;i} μ(x_i,{u,v}) for {u,v}∈F, where the set F ⊆ \binom{V}{2} is fixed. We extend the complexity classification above by providing an explicit condition on (μ,F) for the problem to be tractable. In order to prove the tractability part, we generalize Hirai's theory and define a larger class of discrete convex functions. It covers, in particular, another well-known class of functions, namely submodular functions on an integer lattice. Finally, we improve the complexity of Hirai's algorithm for solving 0-Ext on orientable modular graphs.

AU  - Dvorak, Martin
AU  - Kolmogorov, Vladimir
ID  - 10045
JF  - Mathematical Programming
KW  - minimum 0-extension problem
KW  - metric labeling problem
KW  - discrete metric spaces
KW  - metric extensions
KW  - computational complexity
KW  - valued constraint satisfaction problems
KW  - discrete convex analysis
KW  - L-convex functions
SN  - 0025-5610
TI  - Generalized minimum 0-extension problem and discrete convexity
ER  - 
TY  - CONF
AB  - A central problem in computational statistics is to convert a procedure for sampling combinatorial objects into a procedure for counting those objects, and vice versa. We will consider sampling problems which come from Gibbs distributions, which are families of probability distributions over a discrete space Ω with probability mass function of the form μ^Ω_β(ω) ∝ e^{β H(ω)} for β in an interval [β_min, β_max] and H(ω) ∈ {0} ∪ [1, n].
The partition function is the normalization factor Z(β) = ∑_{ω ∈ Ω} e^{β H(ω)}, and the log partition ratio is defined as q = log(Z(β_max)/Z(β_min)).
We develop a number of algorithms to estimate the counts c_x using roughly Õ(q/ε²) samples for general Gibbs distributions and Õ(n²/ε²) samples for integer-valued distributions (ignoring some second-order terms and parameters). We show this is optimal up to logarithmic factors. We illustrate with improved algorithms for counting connected subgraphs and perfect matchings in a graph.
AU  - Harris, David G.
AU  - Kolmogorov, Vladimir
ID  - 14084
SN  - 1868-8969
T2  - 50th International Colloquium on Automata, Languages, and Programming
TI  - Parameter estimation for Gibbs distributions
VL  - 261
ER  - 
TY  - CONF
AB  - We formalized general (i.e., type-0) grammars using the Lean 3 proof assistant. We defined basic notions of rewrite rules and of words derived by a grammar, and used grammars to show closure of the class of type-0 languages under four operations: union, reversal, concatenation, and the Kleene star. The literature mostly focuses on Turing machine arguments, which are possibly more difficult to formalize. For the Kleene star, we could not follow the literature and came up with our own grammar-based construction.
AU  - Dvorak, Martin
AU  - Blanchette, Jasmin
ID  - 13120
SN  - 9783959772846
T2  - 14th International Conference on Interactive Theorem Proving
TI  - Closure properties of general grammars - formally verified
VL  - 268
ER  - 
TY  - CONF
AB  - We consider the problem of solving LP relaxations of MAP-MRF inference problems, and in particular the method proposed recently in [16], [35]. As a key computational subroutine, it uses a variant of the Frank-Wolfe (FW) method to minimize a smooth convex function over a combinatorial polytope. We propose an efficient implementation of this subroutine based on in-face Frank-Wolfe directions, introduced in [4] in a different context. More generally, we define an abstract data structure for a combinatorial subproblem that enables in-face FW directions, and describe its specialization for tree-structured MAP-MRF inference subproblems. Experimental results indicate that the resulting method is the current state-of-the-art LP solver for some classes of problems. Our code is available at pub.ist.ac.at/~vnk/papers/IN-FACE-FW.html.
AU  - Kolmogorov, Vladimir
ID  - 14448
SN  - 1063-6919
T2  - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
TI  - Solving relaxations of MAP-MRF problems: Combinatorial in-face Frank-Wolfe directions
VL  - 2023
ER  - 
TY  - JOUR
AB  - We consider two models for the sequence labeling (tagging) problem. The first one is a Pattern-Based Conditional Random Field (PB), in which the energy of a string (chain labeling) x = x_1…x_n ∈ D^n is a sum of terms over intervals [i,j] where each term is non-zero only if the substring x_i…x_j equals a prespecified word w∈Λ. The second model is a Weighted Context-Free Grammar (WCFG) frequently used for natural language processing. PB and WCFG encode local and non-local interactions respectively, and thus can be viewed as complementary. We propose a Grammatical Pattern-Based CRF model (GPB) that combines the two in a natural way. We argue that it has certain advantages over existing approaches such as the Hybrid model of Benedí and Sanchez that combines N-grams and WCFGs. The focus of this paper is to analyze the complexity of inference tasks in a GPB such as computing MAP. We present a polynomial-time algorithm for general GPBs and a faster version for a special case that we call Interaction Grammars.
AU  - Takhanov, Rustem
AU  - Kolmogorov, Vladimir
ID  - 10737
IS  - 1
JF  - Intelligent Data Analysis
SN  - 1088-467X
TI  - Combining pattern-based CRFs and weighted context-free grammars
VL  - 26
ER  - 
TY  - JOUR
AB  - The focus of this paper is the weak convergence of an inertial iterative method for solving variational inequalities. The cost function is assumed to be non-Lipschitz and monotone. We propose a projection-type method with inertial terms and give a weak convergence analysis under appropriate conditions. Numerical tests are performed and compared with relevant methods in the literature to show the efficiency and advantages of our proposed method.
AU  - Shehu, Yekini
AU  - Iyiola, Olaniyi S.
ID  - 7577
IS  - 1
JF  - Applicable Analysis
SN  - 0003-6811
TI  - Weak convergence for variational inequalities with inertial-type method
VL  - 101
ER  - 
TY  - CONF
AB  - The Lovász Local Lemma (LLL) is a powerful tool in probabilistic combinatorics which can be used to establish the existence of objects that satisfy certain properties. The breakthrough paper of Moser and Tardos and follow-up works revealed that the LLL has intimate connections with a class of stochastic local search algorithms for finding such desirable objects. In particular, it can be seen as a sufficient condition for this type of algorithm to converge fast. Besides conditions for existence of and fast convergence to desirable objects, one may naturally ask further questions regarding properties of these algorithms. For instance, "are they parallelizable?", "how many solutions can they output?", "what is the expected 'weight' of a solution?", etc. These questions and more have been answered for a class of LLL-inspired algorithms called commutative. In this paper we introduce a new, very natural and more general notion of commutativity (essentially matrix commutativity) which allows us to show a number of new refined properties of LLL-inspired local search algorithms with significantly simpler proofs.
AU  - Harris, David G.
AU  - Iliopoulos, Fotis
AU  - Kolmogorov, Vladimir
ID  - 10072
SN  - 1868-8969
T2  - Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques
TI  - A new notion of commutativity for the algorithmic Lovász Local Lemma
VL  - 207
ER  - 
TY  - CONF
AB  - We study a class of convex-concave saddle-point problems of the form min_x max_y ⟨Kx,y⟩ + f_P(x) − h^*(y), where K is a linear operator, f_P is the sum of a convex function f with a Lipschitz-continuous gradient and the indicator function of a bounded convex polytope P, and h^* is a convex (possibly nonsmooth) function. Such problems arise, for example, as Lagrangian relaxations of various discrete optimization problems. Our main assumptions are the existence of an efficient linear minimization oracle (lmo) for f_P and an efficient proximal map for h^*, which motivate the solution via a blend of proximal primal-dual algorithms and Frank-Wolfe algorithms. In case h^* is the indicator function of a linear constraint and function f is quadratic, we show an O(1/n^2) convergence rate on the dual objective, requiring O(n log n) calls of lmo. If the problem comes from the constrained optimization problem min_{x∈R^d} {f_P(x) | Ax − b = 0}, then we additionally get the bound O(1/n^2) both on the primal gap and on the infeasibility gap. In the most general case, we show an O(1/n) convergence rate of the primal-dual gap, again requiring O(n log n) calls of lmo. To the best of our knowledge, this improves on the known convergence rates for the considered class of saddle-point problems. We show applications to labeling problems frequently appearing in machine learning and computer vision.
AU  - Kolmogorov, Vladimir
AU  - Pock, Thomas
ID  - 10552
T2  - 38th International Conference on Machine Learning
TI  - One-sided Frank-Wolfe algorithms for saddle problems
ER  - 
TY  - CONF
AB  - The convex grabbing game is a game where two players, Alice and Bob, alternate taking extremal points from the convex hull of a point set on the plane. Rational weights are given to the points. The goal of each player is to maximize the total weight over all points that they obtain. We restrict the setting to the case of binary weights. We show a construction of an arbitrarily large odd-sized point set that allows Bob to obtain almost 3/4 of the total weight. This construction answers a question asked by Matsumoto, Nakamigawa, and Sakuma in [Graphs and Combinatorics, 36/1 (2020)]. We also present an arbitrarily large even-sized point set where Bob can obtain the entirety of the total weight. Finally, we discuss conjectures about optimum moves in the convex grabbing game for both players in general.
AU  - Dvorak, Martin
AU  - Nicholson, Sara
ID  - 9592
KW  - convex grabbing game
KW  - graph grabbing game
KW  - combinatorial game
KW  - convex geometry
T2  - Proceedings of the 33rd Canadian Conference on Computational Geometry
TI  - Massively winning configurations in the convex grabbing game on the plane
ER  - 
TY  - JOUR
AB  - In this paper, we consider reflected three-operator splitting methods for monotone inclusion problems in real Hilbert spaces. To do this, we first obtain weak convergence analysis and nonasymptotic O(1/n) convergence rate of the reflected Krasnosel'skiĭ-Mann iteration for finding a fixed point of nonexpansive mapping in real Hilbert spaces under some seemingly easy to implement conditions on the iterative parameters. We then apply our results to three-operator splitting for the monotone inclusion problem and consequently obtain the corresponding convergence analysis. Furthermore, we derive reflected primal-dual algorithms for highly structured monotone inclusion problems. Some numerical implementations are drawn from splitting methods to support the theoretical analysis.
AU  - Iyiola, Olaniyi S.
AU  - Enyi, Cyril D.
AU  - Shehu, Yekini
ID  - 9469
JF  - Optimization Methods and Software
SN  - 1055-6788
TI  - Reflected three-operator splitting method for monotone inclusion problem
ER  - 
TY  - JOUR
AB  - In this paper, we present two new inertial projection-type methods for solving multivalued variational inequality problems in finite-dimensional spaces. We establish the convergence of the sequence generated by these methods when the multivalued mapping associated with the problem is only required to be locally bounded without any monotonicity assumption. Furthermore, the inertial techniques that we employ in this paper are quite different from the ones used in most papers. Moreover, based on the weaker assumptions on the inertial factor in our methods, we derive several special cases of our methods. Finally, we present some experimental results to illustrate the benefits that we gain by introducing the inertial extrapolation steps.
AU  - Izuchukwu, Chinedu
AU  - Shehu, Yekini
ID  - 9234
IS  - 2
JF  - Networks and Spatial Economics
KW  - Computer Networks and Communications
KW  - Software
KW  - Artificial Intelligence
SN  - 1566-113X
TI  - New inertial projection methods for solving multivalued variational inequality problems beyond monotonicity
VL  - 21
ER  - 
TY  - CONF
AB  - In the multiway cut problem we are given a weighted undirected graph G=(V,E) and a set T⊆V of k terminals. The goal is to find a minimum weight set of edges E′⊆E with the property that by removing E′ from G all the terminals become disconnected. In this paper we present a simple local search approximation algorithm for the multiway cut problem with approximation ratio 2 − 2/k. We present an experimental evaluation of the performance of our local search algorithm and show that it greatly outperforms the isolation heuristic of Dahlhaus et al. and that it has similar performance to the much more complex algorithms of Calinescu et al., Sharma and Vondrak, and Buchbinder et al., which have the currently best known approximation ratios for this problem.
AU  - Bloch-Hansen, Andrew
AU  - Samei, Nasim
AU  - Solis-Oba, Roberto
ID  - 9227
SN  - 0302-9743
T2  - Conference on Algorithms and Discrete Applied Mathematics
TI  - Experimental evaluation of a local search approximation algorithm for the multiway cut problem
VL  - 12601
ER  - 
TY  - JOUR
AB  - The paper introduces an inertial extragradient subgradient method with self-adaptive step sizes for solving equilibrium problems in real Hilbert spaces. Weak convergence of the proposed method is obtained under the condition that the bifunction is pseudomonotone and Lipschitz continuous. Linear convergence is also given when the bifunction is strongly pseudomonotone and Lipschitz continuous. Numerical implementations and comparisons with other related inertial methods are given using test problems including a real-world application to the Nash–Cournot oligopolistic electricity market equilibrium model.
AU  - Shehu, Yekini
AU  - Iyiola, Olaniyi S.
AU  - Thong, Duong Viet
AU  - Van, Nguyen Thi Cam
ID  - 8817
IS  - 2
JF  - Mathematical Methods of Operations Research
SN  - 1432-2994
TI  - An inertial subgradient extragradient algorithm extended to pseudomonotone equilibrium problems
VL  - 93
ER  - 
TY  - JOUR
AB  - We consider inertial iteration methods for the Fermat–Weber location problem and primal–dual three-operator splitting in real Hilbert spaces. To this end, we first obtain a weak convergence analysis and a nonasymptotic O(1/n) convergence rate of the inertial Krasnoselskii–Mann iteration for fixed points of nonexpansive operators in infinite dimensional real Hilbert spaces under some seemingly easy-to-implement conditions on the iterative parameters. One of our contributions is that the convergence analysis and rate of convergence results are obtained using conditions that appear less complicated and restrictive than those assumed in previous related results in the literature. We then show that the Fermat–Weber location problem and primal–dual three-operator splitting are special cases of the fixed point problem of a nonexpansive mapping and consequently obtain the convergence analysis of inertial iteration methods for the Fermat–Weber location problem and primal–dual three-operator splitting in real Hilbert spaces. Some numerical implementations are drawn from primal–dual three-operator splitting to support the theoretical analysis.
AU  - Iyiola, Olaniyi S.
AU  - Shehu, Yekini
ID  - 9315
IS  - 2
JF  - Results in Mathematics
SN  - 1422-6383
TI  - New convergence results for inertial Krasnoselskii–Mann iterations in Hilbert spaces with applications
VL  - 76
ER  - 
TY  - JOUR
AB  - In this paper, we propose a new iterative method with alternated inertial step for solving split common null point problem in real Hilbert spaces. We obtain weak convergence of the proposed iterative algorithm. Furthermore, we introduce the notion of bounded linear regularity property for the split common null point problem and obtain the linear convergence property for the new algorithm under some mild assumptions. Finally, we provide some numerical examples to demonstrate the performance and efficiency of the proposed method.
AU  - Ogbuisi, Ferdinard U.
AU  - Shehu, Yekini
AU  - Yao, Jen Chih
ID  - 9365
JF  - Optimization
SN  - 0233-1934
TI  - Convergence analysis of new inertial method for the split common null point problem
ER  - 
TY  - JOUR
AB  - This paper aims to obtain a strong convergence result for a Douglas–Rachford splitting method with inertial extrapolation step for finding a zero of the sum of two set-valued maximal monotone operators without any further assumption of uniform monotonicity on any of the involved maximal monotone operators. Furthermore, our proposed method is easy to implement and the inertial factor in our proposed method is a natural choice. Our method of proof is of independent interest. Finally, some numerical implementations are given to confirm the theoretical analysis.
AU  - Shehu, Yekini
AU  - Dong, Qiao-Li
AU  - Liu, Lu-Lu
AU  - Yao, Jen-Chih
ID  - 8196
JF  - Optimization and Engineering
SN  - 1389-4420
TI  - New strong convergence method for the sum of two maximal monotone operators
VL  - 22
ER  - 
TY  - JOUR
AB  - In this paper, we introduce a relaxed CQ method with alternated inertial step for solving split feasibility problems. We give convergence of the sequence generated by our method under some suitable assumptions. Some numerical implementations from sparse signal and image deblurring are reported to show the efficiency of our method.
AU  - Shehu, Yekini
AU  - Gibali, Aviv
ID  - 7925
JF  - Optimization Letters
SN  - 1862-4472
TI  - New inertial relaxed method for solving split feasibilities
VL  - 15
ER  - 
TY  - JOUR
AB  - We consider the monotone variational inequality problem in a Hilbert space and describe a projection-type method with inertial terms with the following properties: (a) The method generates a strongly convergent iteration sequence; (b) The method requires, at each iteration, only one projection onto the feasible set and two evaluations of the operator; (c) The method is designed for variational inequalities for which the underlying operator is monotone and uniformly continuous; (d) The method includes an inertial term. The latter is also shown to speed up the convergence in our numerical results. A comparison with some related methods is given and indicates that the new method is promising.
AU  - Shehu, Yekini
AU  - Li, Xiao-Huan
AU  - Dong, Qiao-Li
ID  - 6593
JF  - Numerical Algorithms
SN  - 1017-1398
TI  - An efficient projection-type method for monotone variational inequalities in Hilbert spaces
VL  - 84
ER  - 
TY  - JOUR
AB  - The projection methods with vanilla inertial extrapolation step for variational inequalities have been of interest to many authors recently due to the improved convergence speed contributed by the presence of the inertial extrapolation step. However, it has been discovered that these projection methods with inertial steps lose the Fejér monotonicity of the iterates with respect to the solution, which is enjoyed by their corresponding non-inertial projection methods for variational inequalities. Because of this lack of Fejér monotonicity, projection methods with vanilla inertial extrapolation step for variational inequalities sometimes fail to converge faster than their corresponding non-inertial projection methods. Also, it has recently been proved that the projection methods with vanilla inertial extrapolation step may provide convergence rates that are worse than the classical projected gradient methods for strongly convex functions. In this paper, we introduce projection methods with alternated inertial extrapolation step for solving variational inequalities. We show that the sequence of iterates generated by our methods converges weakly to a solution of the variational inequality under some appropriate conditions. The Fejér monotonicity of the even subsequence is recovered in these methods and a linear rate of convergence is obtained. The numerical implementations of our methods compared with some other inertial projection methods show that our method is more efficient and outperforms some of these inertial projection methods.
AU  - Shehu, Yekini
AU  - Iyiola, Olaniyi S.
ID  - 8077
JF  - Applied Numerical Mathematics
SN  - 0168-9274
TI  - Projection methods with alternating inertial steps for variational inequalities: Weak and linear convergence
VL  - 157
ER  - 
TY  - JOUR
AB  - In this paper, we introduce an inertial projection-type method with different updating strategies for solving quasi-variational inequalities with strongly monotone and Lipschitz continuous operators in real Hilbert spaces. Under standard assumptions, we establish different strong convergence results for the proposed algorithm. Preliminary numerical experiments demonstrate the potential applicability of our scheme compared with some related methods in the literature.
AU  - Shehu, Yekini
AU  - Gibali, Aviv
AU  - Sagratella, Simone
ID  - 7161
JF  - Journal of Optimization Theory and Applications
SN  - 0022-3239
TI  - Inertial projection-type methods for solving quasi-variational inequalities in real Hilbert spaces
VL  - 184
ER  - 
TY  - CONF
AB  - A Valued Constraint Satisfaction Problem (VCSP) provides a common framework that can express a wide range of discrete optimization problems. A VCSP instance is given by a finite set of variables, a finite domain of labels, and an objective function to be minimized. This function is represented as a sum of terms where each term depends on a subset of the variables. To obtain different classes of optimization problems, one can restrict all terms to come from a fixed set Γ of cost functions, called a language. 
Recent breakthrough results have established a complete complexity classification of such classes with respect to language Γ: if all cost functions in Γ satisfy a certain algebraic condition then all Γ-instances can be solved in polynomial time, otherwise the problem is NP-hard. Unfortunately, testing this condition for a given language Γ is known to be NP-hard. We thus study exponential algorithms for this meta-problem. We show that the tractability condition of a finite-valued language Γ can be tested in O((∛3)^{|D|} ⋅ poly(size(Γ))) time, where D is the domain of Γ and poly(⋅) is some fixed polynomial. We also obtain a matching lower bound under the Strong Exponential Time Hypothesis (SETH). More precisely, we prove that for any constant δ<1 there is no O((∛3)^{δ|D|}) algorithm, assuming that SETH holds.
AU  - Kolmogorov, Vladimir
ID  - 6725
SN  - 1868-8969
T2  - 46th International Colloquium on Automata, Languages and Programming
TI  - Testing the complexity of a valued CSP language
VL  - 132
ER  - 
TY  - JOUR
AB  - It is well known that many problems in image recovery, signal processing, and machine learning can be modeled as finding zeros of the sum of maximal monotone and Lipschitz continuous monotone operators. Many papers have studied forward-backward splitting methods for finding zeros of the sum of two monotone operators in Hilbert spaces. Most of the proposed splitting methods in the literature have been proposed for the sum of maximal monotone and inverse-strongly monotone operators in Hilbert spaces. In this paper, we consider splitting methods for finding zeros of the sum of maximal monotone operators and Lipschitz continuous monotone operators in Banach spaces. We obtain weak and strong convergence results for the zeros of the sum of maximal monotone and Lipschitz continuous monotone operators in Banach spaces. Many already studied problems in the literature can be considered as special cases of this paper.
AU  - Shehu, Yekini
ID  - 6596
IS  - 4
JF  - Results in Mathematics
SN  - 1422-6383
TI  - Convergence results of forward-backward algorithms for sum of monotone operators in Banach spaces
VL  - 74
ER  - 
TY  - JOUR
AB  - The main contributions of this paper are the proposal and the convergence analysis of a class of inertial projection-type algorithms for solving variational inequality problems in real Hilbert spaces where the underlying operator is monotone and uniformly continuous. We carry out a unified analysis of the proposed method under very mild assumptions. In particular, weak convergence of the generated sequence is established and a nonasymptotic O(1/n) rate of convergence is established, where n denotes the iteration counter. We also present some experimental results to illustrate the benefits gained by introducing the inertial extrapolation steps.
AU  - Shehu, Yekini
AU  - Iyiola, Olaniyi S.
AU  - Li, Xiao-Huan
AU  - Dong, Qiao-Li
ID  - 7000
IS  - 4
JF  - Computational and Applied Mathematics
SN  - 2238-3603
TI  - Convergence analysis of projection method for variational inequalities
VL  - 38
ER  - 
TY  - JOUR
AB  - We develop a framework for the rigorous analysis of focused stochastic local search algorithms. These algorithms search a state space by repeatedly selecting some constraint that is violated in the current state and moving to a random nearby state that addresses the violation, while (we hope) not introducing many new violations. An important class of focused local search algorithms with provable performance guarantees has recently arisen from algorithmizations of the Lovász local lemma (LLL), a nonconstructive tool for proving the existence of satisfying states by introducing a background measure on the state space. While powerful, the state transitions of algorithms in this class must be, in a precise sense, perfectly compatible with the background measure. In many applications this is a very restrictive requirement, and one needs to step outside the class. Here we introduce the notion of measure distortion and develop a framework for analyzing arbitrary focused stochastic local search algorithms, recovering LLL algorithmizations as the special case of no distortion. Our framework takes as input an arbitrary algorithm of such type and an arbitrary probability measure and shows how to use the measure as a yardstick of algorithmic progress, even for algorithms designed independently of the measure.
AU  - Achlioptas, Dimitris
AU  - Iliopoulos, Fotis
AU  - Kolmogorov, Vladimir
ID  - 7412
IS  - 5
JF  - SIAM Journal on Computing
SN  - 0097-5397
TI  - A local lemma for focused stochastic algorithms
VL  - 48
ER  - 
TY  - CONF
AB  - We present a new proximal bundle method for Maximum-A-Posteriori (MAP) inference in structured energy minimization problems. The method optimizes a Lagrangean relaxation of the original energy minimization problem using a multi-plane block-coordinate Frank-Wolfe method that takes advantage of the specific structure of the Lagrangean decomposition. We show empirically that our method outperforms state-of-the-art Lagrangean decomposition-based algorithms on some challenging Markov Random Field, multi-label discrete tomography and graph matching problems.
AU  - Swoboda, Paul
AU  - Kolmogorov, Vladimir
ID  - 7468
SN  - 1063-6919
T2  - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
TI  - MAP inference via block-coordinate Frank-Wolfe algorithm
VL  - 2019-June
ER  - 
TY  - CONF
AB  - Deep neural networks (DNNs) have become increasingly important due to their excellent empirical performance on a wide range of problems. However, regularization is generally achieved by indirect means, largely due to the complex set of functions defined by a network and the difficulty in measuring function complexity. There exists no method in the literature for additive regularization based on a norm of the function, as is classically considered in statistical learning theory. In this work, we study the tractability of function norms for deep neural networks with ReLU activations. We provide, to the best of our knowledge, the first proof in the literature of the NP-hardness of computing function norms of DNNs of 3 or more layers. We also highlight a fundamental difference between shallow and deep networks. In light of these results, we propose a new regularization strategy based on approximate function norms, and show its efficiency on a segmentation task with a DNN.
AU  - Rannen-Triki, Amal
AU  - Berman, Maxim
AU  - Kolmogorov, Vladimir
AU  - Blaschko, Matthew B.
ID  - 7639
SN  - 9781728150239
T2  - Proceedings of the 2019 International Conference on Computer Vision Workshop
TI  - Function norms for neural networks
ER  - 
TY  - JOUR
AB  - We consider the NP-hard problem of MAP-inference for undirected discrete graphical models. We propose a polynomial time and practically efficient algorithm for finding a part of its optimal solution. Specifically, our algorithm marks some labels of the considered graphical model either as (i) optimal, meaning that they belong to all optimal solutions of the inference problem; (ii) non-optimal if they provably do not belong to any solution. With access to an exact solver of a linear programming relaxation to the MAP-inference problem, our algorithm marks the maximal possible (in a specified sense) number of labels. We also present a version of the algorithm, which has access to a suboptimal dual solver only and still can ensure the (non-)optimality for the marked labels, although the overall number of the marked labels may decrease. We propose an efficient implementation, which runs in time comparable to a single run of a suboptimal dual solver. Our method is well-scalable and shows state-of-the-art results on computational benchmarks from machine learning and computer vision.
AU  - Shekhovtsov, Alexander
AU  - Swoboda, Paul
AU  - Savchynskyy, Bogdan
ID  - 703
IS  - 7
JF  - IEEE Transactions on Pattern Analysis and Machine Intelligence
SN  - 0162-8828
TI  - Maximum persistency via iterative relaxed inference with graphical models
VL  - 40
ER  - 
TY  - CHAP
AB  - We prove that every congruence distributive variety has directed Jónsson terms, and every congruence modular variety has directed Gumm terms. The directed terms we construct witness every case of absorption witnessed by the original Jónsson or Gumm terms. This result is equivalent to a pair of claims about absorption for admissible preorders in congruence distributive and congruence modular varieties, respectively. For finite algebras, these absorption theorems have already seen significant applications, but until now, it was not clear if the theorems hold for general algebras as well. Our method also yields a novel proof of a result by P. Lipparini about the existence of a chain of terms (which we call Pixley terms) in varieties that are at the same time congruence distributive and k-permutable for some k.
AU  - Kazda, Alexandr
AU  - Kozik, Marcin
AU  - McKenzie, Ralph
AU  - Moore, Matthew
ED  - Czelakowski, J
ID  - 10864
SN  - 2211-2758
T2  - Don Pigozzi on Abstract Algebraic Logic, Universal Algebra, and Computer Science
TI  - Absorption and directed Jónsson terms
VL  - 16
ER  - 
TY  - CONF
AB  - The accuracy of information retrieval systems is often measured using complex loss functions such as the average precision (AP) or the normalized discounted cumulative gain (NDCG). Given a set of positive and negative samples, the parameters of a retrieval system can be estimated by minimizing these loss functions. However, the non-differentiability and non-decomposability of these loss functions do not allow for simple gradient based optimization algorithms. This issue is generally circumvented by either optimizing a structured hinge-loss upper bound to the loss function or by using asymptotic methods like the direct-loss minimization framework. Yet, the high computational complexity of loss-augmented inference, which is necessary for both the frameworks, prohibits its use in large training data sets. To alleviate this deficiency, we present a novel quicksort flavored algorithm for a large class of non-decomposable loss functions. We provide a complete characterization of the loss functions that are amenable to our algorithm, and show that it includes both AP and NDCG based loss functions. Furthermore, we prove that no comparison based algorithm can improve upon the computational complexity of our approach asymptotically. We demonstrate the effectiveness of our approach in the context of optimizing the structured hinge loss upper bound of AP and NDCG loss for learning models for a variety of vision tasks. We show that our approach provides significantly better results than simpler decomposable loss functions, while requiring a comparable training time.
AU  - Mohapatra, Pritish
AU  - Rolinek, Michal
AU  - Jawahar, C V
AU  - Kolmogorov, Vladimir
AU  - Kumar, M Pawan
ID  - 273
SN  - 9781538664209
T2  - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
TI  - Efficient optimization for rank-based loss functions
ER  - 
TY  - CONF
AB  - We show attacks on five data-independent memory-hard functions (iMHF) that were submitted to the password hashing competition (PHC). Informally, an MHF is a function which cannot be evaluated on dedicated hardware, like ASICs, at significantly lower hardware and/or energy cost than evaluating a single instance on a standard single-core architecture. Data-independent means the memory access pattern of the function is independent of the input; this makes iMHFs harder to construct than data-dependent ones, but the latter can be attacked by various side-channel attacks. Following [Alwen-Blocki'16], we capture the evaluation of an iMHF as a directed acyclic graph (DAG). The cumulative parallel pebbling complexity of this DAG is a measure for the hardware cost of evaluating the iMHF on an ASIC. Ideally, one would like the complexity of a DAG underlying an iMHF to be as close to quadratic in the number of nodes of the graph as possible. Instead, we show that (the DAGs underlying) the following iMHFs are far from this bound: Rig.v2, TwoCats and Gambit each having an exponent no more than 1.75. Moreover, we show that the complexity of the iMHF modes of the PHC finalists Pomelo and Lyra2 have exponents at most 1.83 and 1.67 respectively. To show this we investigate a combinatorial property of each underlying DAG (called its depth-robustness). By establishing upper bounds on this property we are then able to apply the general technique of [Alwen-Blocki'16] for analyzing the hardware costs of an iMHF.
AU  - Alwen, Joel F
AU  - Gazi, Peter
AU  - Kamath Hosdurg, Chethan
AU  - Klein, Karen
AU  - Osang, Georg F
AU  - Pietrzak, Krzysztof Z
AU  - Reyzin, Leonid
AU  - Rolinek, Michal
AU  - Rybar, Michal
ID  - 193
T2  - Proceedings of the 2018 on Asia Conference on Computer and Communication Security
TI  - On the memory hardness of data independent password hashing functions
ER  - 
TY  - JOUR
AB  - We consider the recent formulation of the algorithmic Lovász Local Lemma [N. Harvey and J. Vondrák, in Proceedings of FOCS, 2015, pp. 1327–1345; D. Achlioptas and F. Iliopoulos, in Proceedings of SODA, 2016, pp. 2024–2038; D. Achlioptas, F. Iliopoulos, and V. Kolmogorov, A Local Lemma for Focused Stochastic Algorithms, arXiv preprint, 2018] for finding objects that avoid “bad features,” or “flaws.” It extends the Moser–Tardos resampling algorithm [R. A. Moser and G. Tardos, J. ACM, 57 (2010), 11] to more general discrete spaces. At each step the method picks a flaw present in the current state and goes to a new state according to some prespecified probability distribution (which depends on the current state and the selected flaw). However, the recent formulation is less flexible than the Moser–Tardos method since it requires a specific flaw selection rule, whereas the algorithm of Moser and Tardos allows an arbitrary rule (and thus can potentially be implemented more efficiently). We formulate a new “commutativity” condition and prove that it is sufficient for an arbitrary rule to work. It also enables an efficient parallelization under an additional assumption. We then show that existing resampling oracles for perfect matchings and permutations do satisfy this condition.
AU  - Kolmogorov, Vladimir
ID  - 5975
IS  - 6
JF  - SIAM Journal on Computing
SN  - 0097-5397
TI  - Commutativity in the algorithmic Lovász local lemma
VL  - 47
ER  - 
TY  - CONF
AB  - We consider the MAP-inference problem for graphical models, which is a valued constraint satisfaction problem defined on real numbers with a natural summation operation. We propose a family of relaxations (different from the famous Sherali-Adams hierarchy), which naturally define lower bounds for its optimum. This family always contains a tight relaxation and we give an algorithm able to find it and therefore, solve the initial non-relaxed NP-hard problem. The relaxations we consider decompose the original problem into two non-overlapping parts: an easy LP-tight part and a difficult one. For the latter part a combinatorial solver must be used. As we show in our experiments, in a number of applications the second, difficult part constitutes only a small fraction of the whole problem. This property allows to significantly reduce the computational time of the combinatorial solver and therefore solve problems which were out of reach before.
AU  - Haller, Stefan
AU  - Swoboda, Paul
AU  - Savchynskyy, Bogdan
ID  - 5978
T2  - Proceedings of the 32nd AAAI Conference on Artificial Intelligence
TI  - Exact MAP-inference by confining combinatorial search with LP relaxation
ER  - 
TY  - JOUR
AB  - An N-superconcentrator is a directed, acyclic graph with N input nodes and N output nodes such that every subset of the inputs and every subset of the outputs of the same cardinality can be connected by node-disjoint paths. It is known that linear-size and bounded-degree superconcentrators exist. We prove the existence of such superconcentrators with asymptotic density 25.3 (where the density is the number of edges divided by N). The previously best known densities were 28 [12] and 27.4136 [17].
AU  - Kolmogorov, Vladimir
AU  - Rolinek, Michal
ID  - 18
IS  - 10
JF  - Ars Combinatoria
SN  - 0381-7032
TI  - Superconcentrators of density 25.3
VL  - 141
ER  - 
TY  - JOUR
AB  - The main result of this article is a generalization of the classical blossom algorithm for finding perfect matchings. Our algorithm can efficiently solve Boolean CSPs where each variable appears in exactly two constraints (we call it edge CSP) and all constraints are even Δ-matroid relations (represented by lists of tuples). As a consequence of this, we settle the complexity classification of planar Boolean CSPs started by Dvorak and Kupec. Using a reduction to even Δ-matroids, we then extend the tractability result to larger classes of Δ-matroids that we call efficiently coverable. It properly includes classes that were known to be tractable before, namely, co-independent, compact, local, linear, and binary, with the following caveat: We represent Δ-matroids by lists of tuples, while the last two use a representation by matrices. Since an n × n matrix can represent exponentially many tuples, our tractability result is not strictly stronger than the known algorithm for linear and binary Δ-matroids.
AU  - Kazda, Alexandr
AU  - Kolmogorov, Vladimir
AU  - Rolinek, Michal
ID  - 6032
IS  - 2
JF  - ACM Transactions on Algorithms
TI  - Even delta-matroids and the complexity of planar boolean CSPs
VL  - 15
ER  - 
TY  - CONF
AB  - We introduce two novel methods for learning parameters of graphical models for image labelling. The following two tasks underlie both methods: (i) perturb model parameters based on given features and ground truth labelings, so as to exactly reproduce these labelings as optima of the local polytope relaxation of the labelling problem; (ii) train a predictor for the perturbed model parameters so that improved model parameters can be applied to the labelling of novel data. Our first method implements task (i) by inverse linear programming and task (ii) using a regressor e.g. a Gaussian process. Our second approach simultaneously solves tasks (i) and (ii) in a joint manner, while being restricted to linearly parameterised predictors. Experiments demonstrate the merits of both approaches.
AU  - Trajkovska, Vera
AU  - Swoboda, Paul
AU  - Åström, Freddie
AU  - Petra, Stefanie
ED  - Lauze, François
ED  - Dong, Yiqiu
ED  - Bjorholm Dahl, Anders
ID  - 641
SN  - 978-331958770-7
TI  - Graphical model parameter learning by inverse linear programming
VL  - 10302
ER  - 
TY  - JOUR
AB  - An instance of the valued constraint satisfaction problem (VCSP) is given by a finite set of variables, a finite domain of labels, and a sum of functions, each function depending on a subset of the variables. Each function can take finite values specifying costs of assignments of labels to its variables or the infinite value, which indicates an infeasible assignment. The goal is to find an assignment of labels to the variables that minimizes the sum. We study, assuming that P ≠ NP, how the complexity of this very general problem depends on the set of functions allowed in the instances, the so-called constraint language. The case when all allowed functions take values in {0, ∞} corresponds to ordinary CSPs, where one deals only with the feasibility issue, and there is no optimization. This case is the subject of the algebraic CSP dichotomy conjecture predicting for which constraint languages CSPs are tractable (i.e., solvable in polynomial time) and for which they are NP-hard. The case when all allowed functions take only finite values corresponds to a finite-valued CSP, where the feasibility aspect is trivial and one deals only with the optimization issue. The complexity of finite-valued CSPs was fully classified by Thapper and Živný. An algebraic necessary condition for tractability of a general-valued CSP with a fixed constraint language was recently given by Kozik and Ochremiak. As our main result, we prove that if a constraint language satisfies this algebraic necessary condition, and the feasibility CSP (i.e., the problem of deciding whether a given instance has a feasible solution) corresponding to the VCSP with this language is tractable, then the VCSP is tractable. The algorithm is a simple combination of the assumed algorithm for the feasibility CSP and the standard LP relaxation. As a corollary, we obtain that a dichotomy for ordinary CSPs would imply a dichotomy for general-valued CSPs.
AU  - Kolmogorov, Vladimir
AU  - Krokhin, Andrei
AU  - Rolinek, Michal
ID  - 644
IS  - 3
JF  - SIAM Journal on Computing
TI  - The complexity of general-valued CSPs
VL  - 46
ER  - 
TY  - CONF
AB  - We present a novel convex relaxation and a corresponding inference algorithm for the non-binary discrete tomography problem, that is, reconstructing discrete-valued images from few linear measurements. In contrast to state of the art approaches that split the problem into a continuous reconstruction problem for the linear measurement constraints and a discrete labeling problem to enforce discrete-valued reconstructions, we propose a joint formulation that addresses both problems simultaneously, resulting in a tighter convex relaxation. For this purpose a constrained graphical model is set up and evaluated using a novel relaxation optimized by dual decomposition. We evaluate our approach experimentally and show superior solutions both mathematically (tighter relaxation) and experimentally in comparison to previously proposed relaxations.
AU  - Kuske, Jan
AU  - Swoboda, Paul
AU  - Petra, Stefanie
ED  - Lauze, François
ED  - Dong, Yiqiu
ED  - Bjorholm Dahl, Anders
ID  - 646
SN  - 978-331958770-7
TI  - A novel convex relaxation for non binary discrete tomography
VL  - 10302
ER  - 
TY  - THES
AB  - An instance of the Constraint Satisfaction Problem (CSP) is given by a finite set of variables, a finite domain of labels, and a set of constraints, each constraint acting on a subset of the variables. The goal is to find an assignment of labels to its variables that satisfies all constraints (or decide whether one exists). If we allow more general “soft” constraints, which come with (possibly infinite) costs of particular assignments, we obtain instances from a richer class called Valued Constraint Satisfaction Problem (VCSP). There the goal is to find an assignment with minimum total cost.
In this thesis, we focus (assuming that P ≠ NP) on classifying computational complexity of CSPs and VCSPs under certain restricting conditions. Two results are the core content of the work. In one of them, we consider VCSPs parametrized by a constraint language, that is, the set of “soft” constraints allowed to form the instances, and finish the complexity classification modulo (missing pieces of) complexity classification for analogously parametrized CSP. The other result is a generalization of Edmonds’ perfect matching algorithm. This generalization contributes to complexity classifications in two ways. First, it gives a new (largest known) polynomial-time solvable class of Boolean CSPs in which every variable may appear in at most two constraints and second, it settles full classification of Boolean CSPs with planar drawing (again parametrized by a constraint language).
AU  - Rolinek, Michal
ID  - 992
SN  - 2663-337X
TI  - Complexity of constraint satisfaction
ER  - 
TY  - CONF
AB  - The main result of this paper is a generalization of the classical blossom algorithm for finding perfect matchings. Our algorithm can efficiently solve Boolean CSPs where each variable appears in exactly two constraints (we call it edge CSP) and all constraints are even Δ-matroid relations (represented by lists of tuples). As a consequence of this, we settle the complexity classification of planar Boolean CSPs started by Dvorak and Kupec. Knowing that edge CSP is tractable for even Δ-matroid constraints allows us to extend the tractability result to a larger class of Δ-matroids that includes many classes that were known to be tractable before, namely co-independent, compact, local and binary.
AU  - Kazda, Alexandr
AU  - Kolmogorov, Vladimir
AU  - Rolinek, Michal
ID  - 1192
SN  - 978-161197478-2
TI  - Even delta-matroids and the complexity of planar Boolean CSPs
ER  - 
TY  - CONF
AB  - We study the quadratic assignment problem, in computer vision also known as graph matching. Two leading solvers for this problem optimize the Lagrange decomposition duals with sub-gradient and dual ascent (also known as message passing) updates. We explore this direction further and propose several additional Lagrangean relaxations of the graph matching problem along with corresponding algorithms, which are all based on a common dual ascent framework. Our extensive empirical evaluation gives several theoretical insights and suggests a new state-of-the-art anytime solver for the considered problem. Our improvement over state-of-the-art is particularly visible on a new dataset with large-scale sparse problem instances containing more than 500 graph nodes each.
AU  - Swoboda, Paul
AU  - Rother, Carsten
AU  - Abu Alhaija, Hassan
AU  - Kainmueller, Dagmar
AU  - Savchynskyy, Bogdan
ID  - 916
SN  - 978-153860457-1
TI  - A study of lagrangean decompositions and dual ascent solvers for graph matching
VL  - 2017
ER  - 
TY  - CONF
AB  - We propose a dual decomposition and linear program relaxation of the NP-hard minimum cost multicut problem. Unlike other polyhedral relaxations of the multicut polytope, it is amenable to efficient optimization by message passing. Like other polyhedral relaxations, it can be tightened efficiently by cutting planes. We define an algorithm that alternates between message passing and efficient separation of cycle- and odd-wheel inequalities. This algorithm is more efficient than state-of-the-art algorithms based on linear programming, including algorithms written in the framework of leading commercial software, as we show in experiments with large instances of the problem from applications in computer vision, biomedical image analysis and data mining.
AU  - Swoboda, Paul
AU  - Andres, Bjoern
ID  - 915
SN  - 978-153860457-1
TI  - A message passing algorithm for the minimum cost multicut problem
VL  - 2017
ER  - 
TY  - CONF
AB  - We propose a general dual ascent framework for Lagrangean decomposition of combinatorial problems. Although methods of this type have shown their efficiency for a number of problems, so far there was no general algorithm applicable to multiple problem types. In this work, we propose such a general algorithm. It depends on several parameters, which can be used to optimize its performance in each particular setting. We demonstrate efficacy of our method on graph matching and multicut problems, where it outperforms state-of-the-art solvers including those based on subgradient optimization and off-the-shelf linear programming solvers.
AU  - Swoboda, Paul
AU  - Kuske, Jan
AU  - Savchynskyy, Bogdan
ID  - 917
SN  - 978-153860457-1
TI  - A dual ascent framework for Lagrangean decomposition of combinatorial problems
VL  - 2017
ER  - 
TY  - CONF
AB  - We consider the problem of estimating the partition function Z(β) = ∑_x exp(−βH(x)) of a Gibbs distribution with a Hamiltonian H(⋅), or more precisely the logarithm of the ratio q = ln(Z(0)/Z(β)). It has been recently shown how to approximate q with high probability assuming the existence of an oracle that produces samples from the Gibbs distribution for a given parameter value in [0, β]. The current best known approach due to Huber [9] uses O(q ln n ⋅ [ln q + ln ln n + ε^{−2}]) oracle calls on average where ε is the desired accuracy of approximation and H(⋅) is assumed to lie in {0} ∪ [1, n]. We improve the complexity to O(q ln n ⋅ ε^{−2}) oracle calls. We also show that the same complexity can be achieved if exact oracles are replaced with approximate sampling oracles that are within O(ε^2/(q ln n)) variation distance from exact oracles. Finally, we prove a lower bound of Ω(q ⋅ ε^{−2}) oracle calls under a natural model of computation.
AU  - Kolmogorov, Vladimir
ID  - 274
T2  - Proceedings of the 31st Conference On Learning Theory
TI  - A faster approximation algorithm for the Gibbs partition function
VL  - 75
ER  - 
TY  - CONF
AB  - We study the time- and memory-complexities of the problem of computing labels of (multiple) randomly selected challenge-nodes in a directed acyclic graph. The w-bit label of a node is the hash of the labels of its parents, and the hash function is modeled as a random oracle. Specific instances of this problem underlie both proofs of space [Dziembowski et al. CRYPTO’15] as well as popular memory-hard functions like scrypt. As our main tool, we introduce the new notion of a probabilistic parallel entangled pebbling game, a new type of combinatorial pebbling game on a graph, which is closely related to the labeling game on the same graph. As a first application of our framework, we prove that for scrypt, when the underlying hash function is invoked n times, the cumulative memory complexity (CMC) (a notion recently introduced by Alwen and Serbinenko (STOC’15) to capture amortized memory-hardness for parallel adversaries) is at least Ω(w · (n/log(n))^2). This bound holds for adversaries that can store many natural functions of the labels (e.g., linear combinations), but still not arbitrary functions thereof. We then introduce and study a combinatorial quantity, and show how a sufficiently small upper bound on it (which we conjecture) extends our CMC bound for scrypt to hold against arbitrary adversaries. We also show that such an upper bound solves the main open problem for proofs-of-space protocols: namely, establishing that the time complexity of computing the label of a random node in a graph on n nodes (given an initial kw-bit state) reduces tightly to the time complexity for black pebbling on the same graph (given an initial k-node pebbling).
AU  - Alwen, Joel F
AU  - Chen, Binyi
AU  - Kamath Hosdurg, Chethan
AU  - Kolmogorov, Vladimir
AU  - Pietrzak, Krzysztof Z
AU  - Tessaro, Stefano
ID  - 1231
TI  - On the complexity of scrypt and proofs of space in the parallel random oracle model
VL  - 9666
ER  - 
TY  - JOUR
AB  - We characterize absorption in finite idempotent algebras by means of Jónsson absorption and cube term blockers. As an application we show that it is decidable whether a given subset is an absorbing subuniverse of an algebra given by the tables of its basic operations.
AU  - Barto, Libor
AU  - Kazda, Alexandr
ID  - 1353
IS  - 5
JF  - International Journal of Algebra and Computation
TI  - Deciding absorption
VL  - 26
ER  - 
TY  - JOUR
AB  - We consider the problem of minimizing the continuous valued total variation subject to different unary terms on trees and propose fast direct algorithms based on dynamic programming to solve these problems. We treat both the convex and the nonconvex case and derive worst-case complexities that are equal to or better than existing methods. We show applications to total variation based two dimensional image processing and computer vision problems based on a Lagrangian decomposition approach. The resulting algorithms are very efficient, offer a high degree of parallelism, and come along with memory requirements which are only in the order of the number of image pixels.
AU  - Kolmogorov, Vladimir
AU  - Pock, Thomas
AU  - Rolinek, Michal
ID  - 1377
IS  - 2
JF  - SIAM Journal on Imaging Sciences
TI  - Total variation on a tree
VL  - 9
ER  - 
TY  - JOUR
AB  - We prove that whenever A is a 3-conservative relational structure with only binary and unary relations, then the algebra of polymorphisms of A either has no Taylor operation (i.e., CSP(A) is NP-complete), or it generates an SD(∧) variety (i.e., CSP(A) has bounded width).
AU  - Kazda, Alexandr
ID  - 1612
IS  - 1
JF  - Algebra Universalis
TI  - CSP for binary conservative relational structures
VL  - 75
ER  - 
TY  - CONF
AB  - We consider the recent formulation of the Algorithmic Lovász Local Lemma [1], [2] for finding objects that avoid "bad features", or "flaws". It extends the Moser-Tardos resampling algorithm [3] to more general discrete spaces. At each step the method picks a flaw present in the current state and "resamples" it using a "resampling oracle" provided by the user. However, it is less flexible than the Moser-Tardos method since [1], [2] require a specific flaw selection rule, whereas [3] allows an arbitrary rule (and thus can potentially be implemented more efficiently). We formulate a new "commutativity" condition, and prove that it is sufficient for an arbitrary rule to work. It also enables an efficient parallelization under an additional assumption. We then show that existing resampling oracles for perfect matchings and permutations do satisfy this condition. Finally, we generalize the precondition in [2] (in the case of symmetric potential causality graphs). This unifies special cases that previously were treated separately.
AU  - Kolmogorov, Vladimir
ID  - 1193
T2  - Proceedings - Annual IEEE Symposium on Foundations of Computer Science
TI  - Commutativity in the algorithmic Lovász local lemma
VL  - 2016-December
ER  - 
TY  - JOUR
AB  - We consider Conditional random fields (CRFs) with pattern-based potentials defined on a chain. In this model the energy of a string (labeling) (Formula presented.) is the sum of terms over intervals [i, j] where each term is non-zero only if the substring (Formula presented.) equals a prespecified pattern w. Such CRFs can be naturally applied to many sequence tagging problems. We present efficient algorithms for the three standard inference tasks in a CRF, namely computing (i) the partition function, (ii) marginals, and (iii) computing the MAP. Their complexities are respectively (Formula presented.), (Formula presented.) and (Formula presented.) where L is the combined length of input patterns, (Formula presented.) is the maximum length of a pattern, and D is the input alphabet. This improves on the previous algorithms of Ye et al. (NIPS, 2009) whose complexities are respectively (Formula presented.), (Formula presented.) and (Formula presented.), where (Formula presented.) is the number of input patterns. In addition, we give an efficient algorithm for sampling, and revisit the case of MAP with non-positive weights.
AU  - Kolmogorov, Vladimir
AU  - Takhanov, Rustem
ID  - 1794
IS  - 1
JF  - Algorithmica
TI  - Inference algorithms for pattern-based CRFs on sequence data
VL  - 76
ER  - 
TY  - CONF
AB  - Constraint Satisfaction Problem (CSP) is a fundamental algorithmic problem that appears in many areas of Computer Science. It can be equivalently stated as computing a homomorphism R → Γ between two relational structures, e.g. between two directed graphs. Analyzing its complexity has been a prominent research direction, especially for the fixed template CSPs where the right side Γ is fixed and the left side R is unconstrained.

Far fewer results are known for the hybrid setting that restricts both sides simultaneously. It assumes that R belongs to a certain class of relational structures (called a structural restriction in this paper). We study which structural restrictions are effective, i.e. there exists a fixed template Γ (from a certain class of languages) for which the problem is tractable when R is restricted, and NP-hard otherwise. We provide a characterization for structural restrictions that are closed under inverse homomorphisms. The criterion is based on the chromatic number of a relational structure defined in this paper; it generalizes the standard chromatic number of a graph.

As our main tool, we use the algebraic machinery developed for fixed template CSPs. To apply it to our case, we introduce a new construction called a “lifted language”. We also give a characterization for structural restrictions corresponding to minor-closed families of graphs, extend results to certain Valued CSPs (namely conservative valued languages), and state implications for (valued) CSPs with ordered variables and for the maximum weight independent set problem on some restricted families of graphs.
AU  - Kolmogorov, Vladimir
AU  - Rolinek, Michal
AU  - Takhanov, Rustem
ID  - 1636
SN  - 978-3-662-48970-3
T2  - 26th International Symposium
TI  - Effectiveness of structural restrictions for hybrid CSPs
VL  - 9472
ER  - 
TY  - JOUR
AB  - We propose a new family of message passing techniques for MAP estimation in graphical models which we call Sequential Reweighted Message Passing (SRMP). Special cases include well-known techniques such as Min-Sum Diffusion (MSD) and a faster Sequential Tree-Reweighted Message Passing (TRW-S). Importantly, our derivation is simpler than the original derivation of TRW-S, and does not involve a decomposition into trees. This allows easy generalizations. The new family of algorithms can be viewed as a generalization of TRW-S from pairwise to higher-order graphical models. We test SRMP on several real-world problems with promising results.
AU  - Kolmogorov, Vladimir
ID  - 1841
IS  - 5
JF  - IEEE Transactions on Pattern Analysis and Machine Intelligence
TI  - A new look at reweighted message passing
VL  - 37
ER  - 
TY  - CONF
AB  - Structural support vector machines (SSVMs) are amongst the best performing models for structured computer vision tasks, such as semantic image segmentation or human pose estimation. Training SSVMs, however, is computationally costly, because it requires repeated calls to a structured prediction subroutine (called max-oracle), which has to solve an optimization problem itself, e.g. a graph cut.
In this work, we introduce a new algorithm for SSVM training that is more efficient than earlier techniques when the max-oracle is computationally expensive, as is frequently the case in computer vision tasks. The main idea is to (i) combine the recent stochastic Block-Coordinate Frank-Wolfe algorithm with efficient hyperplane caching, and (ii) use an automatic selection rule for deciding whether to call the exact max-oracle or to rely on an approximate one based on the cached hyperplanes.
We show experimentally that this strategy leads to faster convergence to the optimum with respect to the number of required oracle calls, and that this translates into faster convergence with respect to the total runtime when the max-oracle is slow compared to the other steps of the algorithm.
AU  - Shah, Neel
AU  - Kolmogorov, Vladimir
AU  - Lampert, Christoph
ID  - 1859
TI  - A multi-plane block-coordinate Frank-Wolfe algorithm for training structural SVMs with a costly max-oracle
ER  - 
TY  - JOUR
AB  - A class of valued constraint satisfaction problems (VCSPs) is characterised by a valued constraint language, a fixed set of cost functions on a finite domain. Finite-valued constraint languages contain functions that take on rational costs and general-valued constraint languages contain functions that take on rational or infinite costs. An instance of the problem is specified by a sum of functions from the language with the goal to minimise the sum. This framework includes and generalises well-studied constraint satisfaction problems (CSPs) and maximum constraint satisfaction problems (Max-CSPs).
Our main result is a precise algebraic characterisation of valued constraint languages whose instances can be solved exactly by the basic linear programming relaxation (BLP). For a general-valued constraint language Γ, BLP is a decision procedure for Γ if and only if Γ admits a symmetric fractional polymorphism of every arity. For a finite-valued constraint language Γ, BLP is a decision procedure if and only if Γ admits a symmetric fractional polymorphism of some arity, or equivalently, if Γ admits a symmetric fractional polymorphism of arity 2.
Using these results, we obtain tractability of several novel and previously widely-open classes of VCSPs, including problems over valued constraint languages that are: (1) submodular on arbitrary lattices; (2) bisubmodular (also known as k-submodular) on arbitrary finite domains; (3) weakly (and hence strongly) tree-submodular on arbitrary trees. 
AU  - Kolmogorov, Vladimir
AU  - Thapper, Johan
AU  - Živný, Stanislav
ID  - 2271
IS  - 1
JF  - SIAM Journal on Computing
TI  - The power of linear programming for general-valued CSPs
VL  - 44
ER  - 
TY  - CONF
AB  - An instance of the Valued Constraint Satisfaction Problem (VCSP) is given by a finite set of variables, a finite domain of labels, and a sum of functions, each function depending on a subset of the variables. Each function can take finite values specifying costs of assignments of labels to its variables or the infinite value, which indicates an infeasible assignment. The goal is to find an assignment of labels to the variables that minimizes the sum. We study, assuming that P ≠ NP, how the complexity of this very general problem depends on the set of functions allowed in the instances, the so-called constraint language. The case when all allowed functions take values in {0, ∞} corresponds to ordinary CSPs, where one deals only with the feasibility issue and there is no optimization. This case is the subject of the Algebraic CSP Dichotomy Conjecture predicting for which constraint languages CSPs are tractable (i.e. solvable in polynomial time) and for which they are NP-hard. The case when all allowed functions take only finite values corresponds to finite-valued CSP, where the feasibility aspect is trivial and one deals only with the optimization issue. The complexity of finite-valued CSPs was fully classified by Thapper and Živný. An algebraic necessary condition for tractability of a general-valued CSP with a fixed constraint language was recently given by Kozik and Ochremiak. As our main result, we prove that if a constraint language satisfies this algebraic necessary condition, and the feasibility CSP (i.e. the problem of deciding whether a given instance has a feasible solution) corresponding to the VCSP with this language is tractable, then the VCSP is tractable. The algorithm is a simple combination of the assumed algorithm for the feasibility CSP and the standard LP relaxation. As a corollary, we obtain that a dichotomy for ordinary CSPs would imply a dichotomy for general-valued CSPs.
AU  - Kolmogorov, Vladimir
AU  - Krokhin, Andrei
AU  - Rolinek, Michal
ID  - 1637
TI  - The complexity of general-valued CSPs
ER  - 
TY  - CONF
AB  - Proofs of work (PoW) have been suggested by Dwork and Naor (Crypto’92) as protection to a shared resource. The basic idea is to ask the service requestor to dedicate some non-trivial amount of computational work to every request. The original applications included prevention of spam and protection against denial of service attacks. More recently, PoWs have been used to prevent double spending in the Bitcoin digital currency system. In this work, we put forward an alternative concept for PoWs - so-called proofs of space (PoS), where a service requestor must dedicate a significant amount of disk space as opposed to computation. We construct secure PoS schemes in the random oracle model (with one additional mild assumption required for the proof to go through), using graphs with high “pebbling complexity” and Merkle hash-trees. We discuss some applications, including follow-up work where a decentralized digital currency scheme called Spacecoin is constructed that uses PoS (instead of wasteful PoW like in Bitcoin) to prevent double spending. The main technical contribution of this work is the construction of (directed, loop-free) graphs on N vertices with in-degree O(log log N) such that even if one places Θ(N) pebbles on the nodes of the graph, there’s a constant fraction of nodes that needs Θ(N) steps to be pebbled (where in every step one can put a pebble on a node if all its parents have a pebble).
AU  - Dziembowski, Stefan
AU  - Faust, Sebastian
AU  - Kolmogorov, Vladimir
AU  - Pietrzak, Krzysztof Z
ID  - 1675
SN  - 0302-9743
T2  - 35th Annual Cryptology Conference
TI  - Proofs of space
VL  - 9216
ER  - 
TY  - CONF
AB  - Energies with high-order non-submodular interactions have been shown to be very useful in vision due to their high modeling power. Optimization of such energies, however, is generally NP-hard. A naive approach that works for small problem instances is exhaustive search, that is, enumeration of all possible labelings of the underlying graph. We propose a general minimization approach for large graphs based on enumeration of labelings of certain small patches. 
This partial enumeration technique reduces complex high-order energy formulations to pairwise Constraint Satisfaction Problems with unary costs (uCSP), which can be efficiently solved using standard methods like TRW-S. Our approach outperforms a number of existing state-of-the-art algorithms on well-known difficult problems (e.g. curvature regularization, stereo, deconvolution); it gives a near-global minimum and better speed.
Our main application of interest is curvature regularization. In the context of segmentation, our partial enumeration technique allows us to evaluate curvature directly on small patches using a novel integral geometry approach.

AU  - Olsson, Carl
AU  - Ulen, Johannes
AU  - Boykov, Yuri
AU  - Kolmogorov, Vladimir
ID  - 2275
TI  - Partial enumeration and curvature regularization
ER  - 
TY  - GEN
AU  - Huszár, Kristóf
AU  - Rolinek, Michal
ID  - 7038
TI  - Playful Math - An introduction to mathematical games
ER  - 
TY  - CONF
AB  - Representation languages for coalitional games are a key research area in algorithmic game theory. There is an inherent tradeoff between how general a language is, allowing it to capture more elaborate games, and how hard it is computationally to optimize and solve such games. One prominent such language is the simple yet expressive Weighted Graph Games (WGGs) representation (Deng and Papadimitriou 1994), which maintains knowledge about synergies between agents in the form of an edge-weighted graph. We consider the problem of finding the optimal coalition structure in WGGs. The agents in such games are vertices in a graph, and the value of a coalition is the sum of the weights of the edges present between coalition members. The optimal coalition structure is a partition of the agents into coalitions that maximizes the sum of utilities obtained by the coalitions. We show that finding the optimal coalition structure is not only hard for general graphs, but is also intractable for restricted families such as planar graphs which are amenable to many other combinatorial problems. We then provide algorithms with constant factor approximations for planar, minor-free and bounded degree graphs.
AU  - Bachrach, Yoram
AU  - Kohli, Pushmeet
AU  - Kolmogorov, Vladimir
AU  - Zadimoghaddam, Morteza
ID  - 2270
TI  - Optimal Coalition Structures in Cooperative Graph Games
ER  - 
TY  - GEN
AB  - We propose a new family of message passing techniques for MAP estimation in graphical models which we call Sequential Reweighted Message Passing (SRMP). Special cases include well-known techniques such as Min-Sum Diffusion (MSD) and a faster Sequential Tree-Reweighted Message Passing (TRW-S). Importantly, our derivation is simpler than the original derivation of TRW-S, and does not involve a decomposition into trees. This allows easy generalizations. We present such a generalization for the case of higher-order graphical models, and test it on several real-world problems with promising results.
AU  - Kolmogorov, Vladimir
ID  - 2273
TI  - Reweighted message passing revisited
ER  - 
TY  - CONF
AB  - The problem of minimizing the Potts energy function frequently occurs in computer vision applications. One way to tackle this NP-hard problem was proposed by Kovtun [19, 20]. It identifies a part of an optimal solution by running k maxflow computations, where k is the number of labels. The number of “labeled” pixels can be significant in some applications, e.g. 50-93% in our tests for stereo. We show how to reduce the runtime to O(log k) maxflow computations (or one parametric maxflow computation). Furthermore, the output of our algorithm allows us to speed up the subsequent alpha expansion for the unlabeled part, or can be used as it is for time-critical applications. To derive our technique, we generalize the algorithm of Felzenszwalb et al. [7] for Tree Metrics. We also show a connection to k-submodular functions from combinatorial optimization, and discuss k-submodular relaxations for general energy functions.
AU  - Gridchyn, Igor
AU  - Kolmogorov, Vladimir
ID  - 2276
TI  - Potts model, parametric maxflow and k-submodular functions
ER  - 
TY  - CONF
AB  - A class of valued constraint satisfaction problems (VCSPs) is characterised by a valued constraint language, a fixed set of cost functions on a finite domain. An instance of the problem is specified by a sum of cost functions from the language with the goal to minimise the sum. We study which classes of finite-valued languages can be solved exactly by the basic linear programming relaxation (BLP). Thapper and Živný showed [20] that if BLP solves the language then the language admits a binary commutative fractional polymorphism. We prove that the converse is also true. This leads to a necessary and a sufficient condition which can be checked in polynomial time for a given language. In contrast, the previous necessary and sufficient condition due to [20] involved infinitely many inequalities. More recently, Thapper and Živný [21] showed (using, in particular, a technique introduced in this paper) that core languages that do not satisfy our condition are NP-hard. Taken together, these results imply that a finite-valued language can either be solved using Linear Programming or is NP-hard.
AU  - Kolmogorov, Vladimir
ID  - 2518
IS  - 1
TI  - The power of linear programming for finite-valued CSPs: A constructive characterization
VL  - 7965
ER  - 
TY  - JOUR
AB  - We study the complexity of valued constraint satisfaction problems (VCSPs) parametrized by a constraint language, a fixed set of cost functions over a finite domain. An instance of the problem is specified by a sum of cost functions from the language and the goal is to minimize the sum. Under the unique games conjecture, the approximability of finite-valued VCSPs is well understood, see Raghavendra [2008]. However, there is no characterization of finite-valued VCSPs, let alone general-valued VCSPs, that can be solved exactly in polynomial time, thus giving insights from a combinatorial optimization perspective. We consider the case of languages containing all possible unary cost functions. In the case of languages consisting of only {0, ∞}-valued cost functions (i.e., relations), such languages have been called conservative and studied by Bulatov [2003, 2011] and recently by Barto [2011]. Since we study valued languages, we call a language conservative if it contains all finite-valued unary cost functions. The computational complexity of conservative valued languages has been studied by Cohen et al. [2006] for languages over Boolean domains, by Deineko et al. [2008] for {0, 1}-valued languages (a.k.a. Max-CSP), and by Takhanov [2010a] for {0, ∞}-valued languages containing all finite-valued unary cost functions (a.k.a. Min-Cost-Hom). We prove a Schaefer-like dichotomy theorem for conservative valued languages: if all cost functions in the language satisfy a certain condition (specified by a complementary combination of STP and MJN multimorphisms), then any instance can be solved in polynomial time (via a new algorithm developed in this article), otherwise the language is NP-hard. This is the first complete complexity classification of general-valued constraint languages over non-Boolean domains. It is a common phenomenon that complexity classifications of problems over non-Boolean domains are significantly harder than the Boolean cases.
The polynomial-time algorithm we present for the tractable cases is a generalization of the submodular minimization problem and a result of Cohen et al. [2008]. Our results generalize previous results by Takhanov [2010a] and (a subset of results) by Cohen et al. [2006] and Deineko et al. [2008]. Moreover, our results do not rely on any computer-assisted search as in Deineko et al. [2008], and provide a powerful tool for proving hardness of finite-valued and general-valued languages.
AU  - Kolmogorov, Vladimir
AU  - Živný, Stanislav
ID  - 2828
IS  - 2
JF  - Journal of the ACM
TI  - The complexity of conservative valued CSPs
VL  - 60
ER  - 
TY  - CONF
AB  - We introduce the M-modes problem for graphical models: predicting the M label configurations of highest probability that are at the same time local maxima of the probability landscape. M-modes have multiple possible applications: because they are intrinsically diverse, they provide a principled alternative to non-maximum suppression techniques for structured prediction, they can act as codebook vectors for quantizing the configuration space, or they can form component centers for mixture model approximation. We present two algorithms for solving the M-modes problem. The first algorithm solves the problem in polynomial time when the underlying graphical model is a simple chain. The second algorithm solves the problem for junction chains. On synthetic and real datasets, we demonstrate how M-modes can improve the performance of prediction. We also use the generated modes as a tool to understand the topography of the probability distribution of configurations, for example with relation to the training set size and amount of noise in the data.
AU  - Chen, Chao
AU  - Kolmogorov, Vladimir
AU  - Yan, Zhu
AU  - Metaxas, Dimitris
AU  - Lampert, Christoph
ID  - 2901
TI  - Computing the M most probable modes of a graphical model
VL  - 31
ER  - 
TY  - CONF
AB  - We consider Conditional Random Fields (CRFs) with pattern-based potentials defined on a chain. In this model the energy of a string (labeling) x_1...x_n is the sum of terms over intervals [i,j] where each term is non-zero only if the substring x_i...x_j equals a prespecified pattern α. Such CRFs can be naturally applied to many sequence tagging problems.
We present efficient algorithms for the three standard inference tasks in a CRF, namely computing (i) the partition function, (ii) marginals, and (iii) the MAP. Their complexities are respectively O(nL), O(nLℓ_max) and O(nL min{|D|, log(ℓ_max+1)}) where L is the combined length of input patterns, ℓ_max is the maximum length of a pattern, and D is the input alphabet. This improves on the previous algorithms of (Ye et al., 2009) whose complexities are respectively O(nL|D|), O(n|Γ|L^2 ℓ_max^2) and O(nL|D|), where |Γ| is the number of input patterns.
In addition, we give an efficient algorithm for sampling. Finally, we consider the case of non-positive weights. (Komodakis & Paragios, 2009) gave an O(nL) algorithm for computing the MAP. We present a modification that has the same worst-case complexity but can beat it in the best case.
AU  - Takhanov, Rustem
AU  - Kolmogorov, Vladimir
ID  - 2272
IS  - 3
T2  - ICML'13 Proceedings of the 30th International Conference on Machine Learning
TI  - Inference algorithms for pattern-based CRFs on sequence data
VL  - 28
ER  - 
TY  - GEN
AB  - Proofs of work (PoW) have been suggested by Dwork and Naor (Crypto'92) as protection to a shared resource. The basic idea is to ask the service requestor to dedicate some non-trivial amount of computational work to every request. The original applications included prevention of spam and protection against denial of service attacks. More recently, PoWs have been used to prevent double spending in the Bitcoin digital currency system.

In this work, we put forward an alternative concept for PoWs -- so-called proofs of space (PoS), where a service requestor must dedicate a significant amount of disk space as opposed to computation. We construct secure PoS schemes in the random oracle model, using graphs with high "pebbling complexity" and Merkle hash-trees.
AU  - Dziembowski, Stefan
AU  - Faust, Sebastian
AU  - Kolmogorov, Vladimir
AU  - Pietrzak, Krzysztof Z
ID  - 2274
TI  - Proofs of Space
ER  - 
TY  - CONF
AB  - In this paper we investigate k-submodular functions. This natural family of discrete functions includes submodular and bisubmodular functions as the special cases k = 1 and k = 2 respectively.

In particular we generalize the known Min-Max-Theorem for submodular and bisubmodular functions. This theorem asserts that the minimum of the (bi)submodular function can be found by solving a maximization problem over a (bi)submodular polyhedron. We define a k-submodular polyhedron, prove a Min-Max-Theorem for k-submodular functions, and give a greedy algorithm to construct the vertices of the polyhedron.

AU  - Huber, Anna
AU  - Kolmogorov, Vladimir
ID  - 2930
TI  - Towards minimizing k-submodular functions
VL  - 7422
ER  - 
TY  - GEN
AB  - This paper addresses the problem of approximate MAP-MRF inference in general graphical models. Following [36], we consider a family of linear programming relaxations of the problem where each relaxation is specified by a set of nested pairs of factors for which the marginalization constraint needs to be enforced. We develop a generalization of the TRW-S algorithm [9] for this problem, where we use a decomposition into junction chains, monotonic w.r.t. some ordering on the nodes. This generalizes the monotonic chains in [9] in a natural way. We also show how to deal with nested factors in an efficient way. Experiments show an improvement over min-sum diffusion, MPLP and subgradient ascent algorithms on a number of computer vision and natural language processing problems.
AU  - Kolmogorov, Vladimir
AU  - Schoenemann, Thomas
ID  - 2928
T2  - arXiv
TI  - Generalized sequential tree-reweighted message passing
ER  - 
TY  - JOUR
AB  - In this paper, we present a new approach for establishing correspondences between sparse image features related by an unknown nonrigid mapping and corrupted by clutter and occlusion, such as points extracted from images of different instances of the same object category. We formulate this matching task as an energy minimization problem by defining an elaborate objective function of the appearance and the spatial arrangement of the features. Optimization of this energy is an instance of graph matching, which is in general an NP-hard problem. We describe a novel graph matching optimization technique, which we refer to as dual decomposition (DD), and demonstrate on a variety of examples that this method outperforms existing graph matching algorithms. In the majority of our examples, DD is able to find the global minimum within a minute. The ability to globally optimize the objective allows us to accurately learn the parameters of our matching model from training examples. We show on several matching tasks that our learned model yields results superior to those of state-of-the-art methods.

AU  - Torresani, Lorenzo
AU  - Kolmogorov, Vladimir
AU  - Rother, Carsten
ID  - 2931
IS  - 2
JF  - IEEE Transactions on Pattern Analysis and Machine Intelligence
TI  - A dual decomposition approach to feature correspondence
VL  - 35
ER  - 
TY  - JOUR
AB  - We consider the problem of minimizing a function represented as a sum of submodular terms. We assume each term allows an efficient computation of exchange capacities. This holds, for example, for terms depending on a small number of variables, or for certain cardinality-dependent terms. A naive application of submodular minimization algorithms would not exploit the existence of specialized exchange capacity subroutines for individual terms. To overcome this, we cast the problem as a submodular flow (SF) problem in an auxiliary graph in such a way that applying most existing SF algorithms would rely only on these subroutines. We then explore in more detail Iwata's capacity scaling approach for submodular flows (Iwata 1997 [19]). In particular, we show how to improve its complexity in the case when the function contains cardinality-dependent terms.
AU  - Kolmogorov, Vladimir
ID  - 3117
IS  - 15
JF  - Discrete Applied Mathematics
TI  - Minimizing a sum of submodular functions
VL  - 160
ER  - 
TY  - JOUR
AB  - Consider a convex relaxation f̂ of a pseudo-Boolean function f. We say that the relaxation is totally half-integral if f̂(x) is a polyhedral function with half-integral extreme points x, and this property is preserved after adding an arbitrary combination of constraints of the form x_i = x_j, x_i = 1 - x_j, and x_i = γ where γ ∈ {0, 1, 1/2} is a constant. A well-known example is the roof duality relaxation for quadratic pseudo-Boolean functions f. We argue that total half-integrality is a natural requirement for generalizations of roof duality to arbitrary pseudo-Boolean functions. Our contributions are as follows. First, we provide a complete characterization of totally half-integral relaxations f̂ by establishing a one-to-one correspondence with bisubmodular functions. Second, we give a new characterization of bisubmodular functions. Finally, we show some relationships between general totally half-integral relaxations and relaxations based on the roof duality. On the conceptual level, our results show that bisubmodular functions provide a natural generalization of the roof duality approach to higher-order terms. This can be viewed as a non-submodular analogue of the fact that submodular functions generalize the s-t minimum cut problem with non-negative weights to higher-order terms.
AU  - Kolmogorov, Vladimir
ID  - 3257
IS  - 4-5
JF  - Discrete Applied Mathematics
TI  - Generalized roof duality and bisubmodular functions
VL  - 160
ER  - 
TY  - CONF
AB  - We consider the problem of inference in a graphical model with binary variables. While in theory it is arguably preferable to compute marginal probabilities, in practice researchers often use MAP inference due to the availability of efficient discrete optimization algorithms. We bridge the gap between the two approaches by introducing the Discrete Marginals technique in which approximate marginals are obtained by minimizing an objective function with unary and pairwise terms over a discretized domain. This allows the use of techniques originally developed for MAP-MRF inference and learning. We explore two ways to set up the objective function - by discretizing the Bethe free energy and by learning it from training data. Experimental results show that for certain types of graphs a learned function can outperform the Bethe approximation. We also establish a link between the Bethe free energy and submodular functions.

AU  - Korc, Filip
AU  - Kolmogorov, Vladimir
AU  - Lampert, Christoph
ID  - 3124
TI  - Approximating marginals using discrete energy minimization
ER  - 
TY  - GEN
AB  - We consider the problem of inference in a graphical model with binary variables. While in theory it is arguably preferable to compute marginal probabilities, in practice researchers often use MAP inference due to the availability of efficient discrete optimization algorithms. We bridge the gap between the two approaches by introducing the Discrete Marginals technique in which approximate marginals are obtained by minimizing an objective function with unary and pairwise terms over a discretized domain. This allows the use of techniques originally developed for MAP-MRF inference and learning. We explore two ways to set up the objective function - by discretizing the Bethe free energy and by learning it from training data. Experimental results show that for certain types of graphs a learned function can outperform the Bethe approximation. We also establish a link between the Bethe free energy and submodular functions.
AU  - Korc, Filip
AU  - Kolmogorov, Vladimir
AU  - Lampert, Christoph
ID  - 5396
SN  - 2664-1690
TI  - Approximating marginals using discrete energy minimization
ER  -