TY - JOUR AB - Given a fixed finite metric space (V,μ), the {\em minimum 0-extension problem}, denoted as 0-Ext[μ], is equivalent to the following optimization problem: minimize a function of the form min_{x ∈ V^n} ∑_i f_i(x_i) + ∑_{ij} c_{ij} μ(x_i, x_j), where c_{ij}, c_{vi} are given nonnegative costs and f_i: V → R are functions given by f_i(x_i) = ∑_{v ∈ V} c_{vi} μ(x_i, v). The computational complexity of 0-Ext[μ] has been recently established by Karzanov and by Hirai: if metric μ is {\em orientable modular} then 0-Ext[μ] can be solved in polynomial time, otherwise 0-Ext[μ] is NP-hard. To prove the tractability part, Hirai developed a theory of discrete convex functions on orientable modular graphs generalizing several known classes of functions in discrete convex analysis, such as L♮-convex functions. We consider a more general version of the problem in which unary functions f_i(x_i) can additionally have terms of the form c_{uv;i} μ(x_i, {u,v}) for {u,v} ∈ F, where the set F ⊆ \binom{V}{2} is fixed. We extend the complexity classification above by providing an explicit condition on (μ,F) for the problem to be tractable. In order to prove the tractability part, we generalize Hirai's theory and define a larger class of discrete convex functions. It covers, in particular, another well-known class of functions, namely submodular functions on an integer lattice. Finally, we improve the complexity of Hirai's algorithm for solving 0-Ext on orientable modular graphs. 
AU - Dvorak, Martin AU - Kolmogorov, Vladimir ID - 10045 JF - Mathematical Programming KW - minimum 0-extension problem KW - metric labeling problem KW - discrete metric spaces KW - metric extensions KW - computational complexity KW - valued constraint satisfaction problems KW - discrete convex analysis KW - L-convex functions SN - 0025-5610 TI - Generalized minimum 0-extension problem and discrete convexity ER - TY - CONF AB - A central problem in computational statistics is to convert a procedure for sampling combinatorial objects into a procedure for counting those objects, and vice versa. We will consider sampling problems which come from Gibbs distributions, which are families of probability distributions over a discrete space Ω with probability mass function of the form μ^Ω_β(ω) ∝ e^{β H(ω)} for β in an interval [β_min, β_max] and H(ω) ∈ {0} ∪ [1, n]. The partition function is the normalization factor Z(β) = ∑_{ω ∈ Ω} e^{β H(ω)}, and the log partition ratio is defined as q = log(Z(β_max)/Z(β_min)). We develop a number of algorithms to estimate the counts c_x using roughly Õ(q/ε²) samples for general Gibbs distributions and Õ(n²/ε²) samples for integer-valued distributions (ignoring some second-order terms and parameters). We show this is optimal up to logarithmic factors. We illustrate with improved algorithms for counting connected subgraphs and perfect matchings in a graph. AU - Harris, David G. AU - Kolmogorov, Vladimir ID - 14084 SN - 1868-8969 T2 - 50th International Colloquium on Automata, Languages, and Programming TI - Parameter estimation for Gibbs distributions VL - 261 ER - TY - CONF AB - We consider the problem of solving LP relaxations of MAP-MRF inference problems, and in particular the method proposed recently in [16], [35]. As a key computational subroutine, it uses a variant of the Frank-Wolfe (FW) method to minimize a smooth convex function over a combinatorial polytope. 
We propose an efficient implementation of this subroutine based on in-face Frank-Wolfe directions, introduced in [4] in a different context. More generally, we define an abstract data structure for a combinatorial subproblem that enables in-face FW directions, and describe its specialization for tree-structured MAP-MRF inference subproblems. Experimental results indicate that the resulting method is the current state-of-the-art LP solver for some classes of problems. Our code is available at pub.ist.ac.at/~vnk/papers/IN-FACE-FW.html. AU - Kolmogorov, Vladimir ID - 14448 SN - 1063-6919 T2 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition TI - Solving relaxations of MAP-MRF problems: Combinatorial in-face Frank-Wolfe directions VL - 2023 ER - TY - JOUR AB - We consider two models for the sequence labeling (tagging) problem. The first one is a Pattern-Based Conditional Random Field (PB), in which the energy of a string (chain labeling) x = x_1…x_n ∈ D^n is a sum of terms over intervals [i,j] where each term is non-zero only if the substring x_i…x_j equals a prespecified word w ∈ Λ. The second model is a Weighted Context-Free Grammar (WCFG) frequently used for natural language processing. PB and WCFG encode local and non-local interactions respectively, and thus can be viewed as complementary. We propose a Grammatical Pattern-Based CRF model (GPB) that combines the two in a natural way. We argue that it has certain advantages over existing approaches such as the Hybrid model of Benedí and Sanchez that combines N-grams and WCFGs. The focus of this paper is to analyze the complexity of inference tasks in a GPB such as computing MAP. We present a polynomial-time algorithm for general GPBs and a faster version for a special case that we call Interaction Grammars. 
AU - Takhanov, Rustem AU - Kolmogorov, Vladimir ID - 10737 IS - 1 JF - Intelligent Data Analysis SN - 1088-467X TI - Combining pattern-based CRFs and weighted context-free grammars VL - 26 ER - TY - CONF AB - The Lovász Local Lemma (LLL) is a powerful tool in probabilistic combinatorics which can be used to establish the existence of objects that satisfy certain properties. The breakthrough paper of Moser and Tardos and follow-up works revealed that the LLL has intimate connections with a class of stochastic local search algorithms for finding such desirable objects. In particular, it can be seen as a sufficient condition for this type of algorithm to converge fast. Besides conditions for existence of and fast convergence to desirable objects, one may naturally ask further questions regarding properties of these algorithms. For instance, "are they parallelizable?", "how many solutions can they output?", "what is the expected "weight" of a solution?", etc. These questions and more have been answered for a class of LLL-inspired algorithms called commutative. In this paper we introduce a new, very natural and more general notion of commutativity (essentially matrix commutativity) which allows us to show a number of new refined properties of LLL-inspired local search algorithms with significantly simpler proofs. AU - Harris, David G. AU - Iliopoulos, Fotis AU - Kolmogorov, Vladimir ID - 10072 SN - 1868-8969 T2 - Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques TI - A new notion of commutativity for the algorithmic Lovász Local Lemma VL - 207 ER - TY - CONF AB - We study a class of convex-concave saddle-point problems of the form min_x max_y ⟨Kx, y⟩ + f_P(x) − h^*(y) where K is a linear operator, f_P is the sum of a convex function f with a Lipschitz-continuous gradient and the indicator function of a bounded convex polytope P, and h^* is a convex (possibly nonsmooth) function. 
Such problems arise, for example, as a Lagrangian relaxation of various discrete optimization problems. Our main assumptions are the existence of an efficient linear minimization oracle (lmo) for f_P and an efficient proximal map for h^* which motivate the solution via a blend of proximal primal-dual algorithms and Frank-Wolfe algorithms. In case h^* is the indicator function of a linear constraint and function f is quadratic, we show an O(1/n^2) convergence rate on the dual objective, requiring O(n log n) calls of lmo. If the problem comes from the constrained optimization problem min_{x ∈ R^d} {f_P(x) | Ax − b = 0} then we additionally get a bound of O(1/n^2) both on the primal gap and on the infeasibility gap. In the most general case, we show an O(1/n) convergence rate of the primal-dual gap, again requiring O(n log n) calls of lmo. To the best of our knowledge, this improves on the known convergence rates for the considered class of saddle-point problems. We show applications to labeling problems frequently appearing in machine learning and computer vision. AU - Kolmogorov, Vladimir AU - Pock, Thomas ID - 10552 T2 - 38th International Conference on Machine Learning TI - One-sided Frank-Wolfe algorithms for saddle problems ER - TY - CONF AB - A Valued Constraint Satisfaction Problem (VCSP) provides a common framework that can express a wide range of discrete optimization problems. A VCSP instance is given by a finite set of variables, a finite domain of labels, and an objective function to be minimized. This function is represented as a sum of terms where each term depends on a subset of the variables. To obtain different classes of optimization problems, one can restrict all terms to come from a fixed set Γ of cost functions, called a language. 
Recent breakthrough results have established a complete complexity classification of such classes with respect to language Γ: if all cost functions in Γ satisfy a certain algebraic condition then all Γ-instances can be solved in polynomial time, otherwise the problem is NP-hard. Unfortunately, testing this condition for a given language Γ is known to be NP-hard. We thus study exponential algorithms for this meta-problem. We show that the tractability condition of a finite-valued language Γ can be tested in O(\sqrt[3]{3}^{|D|} ⋅ poly(size(Γ))) time, where D is the domain of Γ and poly(⋅) is some fixed polynomial. We also obtain a matching lower bound under the Strong Exponential Time Hypothesis (SETH). More precisely, we prove that for any constant δ<1 there is no O(\sqrt[3]{3}^{δ|D|}) algorithm, assuming that SETH holds. AU - Kolmogorov, Vladimir ID - 6725 SN - 1868-8969 T2 - 46th International Colloquium on Automata, Languages and Programming TI - Testing the complexity of a valued CSP language VL - 132 ER - TY - JOUR AB - We develop a framework for the rigorous analysis of focused stochastic local search algorithms. These algorithms search a state space by repeatedly selecting some constraint that is violated in the current state and moving to a random nearby state that addresses the violation, while (we hope) not introducing many new violations. An important class of focused local search algorithms with provable performance guarantees has recently arisen from algorithmizations of the Lovász local lemma (LLL), a nonconstructive tool for proving the existence of satisfying states by introducing a background measure on the state space. While powerful, the state transitions of algorithms in this class must be, in a precise sense, perfectly compatible with the background measure. In many applications this is a very restrictive requirement, and one needs to step outside the class. 
Here we introduce the notion of measure distortion and develop a framework for analyzing arbitrary focused stochastic local search algorithms, recovering LLL algorithmizations as the special case of no distortion. Our framework takes as input an arbitrary algorithm of such type and an arbitrary probability measure and shows how to use the measure as a yardstick of algorithmic progress, even for algorithms designed independently of the measure. AU - Achlioptas, Dimitris AU - Iliopoulos, Fotis AU - Kolmogorov, Vladimir ID - 7412 IS - 5 JF - SIAM Journal on Computing SN - 0097-5397 TI - A local lemma for focused stochastical algorithms VL - 48 ER - TY - CONF AB - We present a new proximal bundle method for Maximum-A-Posteriori (MAP) inference in structured energy minimization problems. The method optimizes a Lagrangean relaxation of the original energy minimization problem using a multi-plane block-coordinate Frank-Wolfe method that takes advantage of the specific structure of the Lagrangean decomposition. We show empirically that our method outperforms state-of-the-art Lagrangean decomposition based algorithms on some challenging Markov Random Field, multi-label discrete tomography and graph matching problems. AU - Swoboda, Paul AU - Kolmogorov, Vladimir ID - 7468 SN - 1063-6919 T2 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition TI - Map inference via block-coordinate Frank-Wolfe algorithm VL - 2019-June ER - TY - CONF AB - Deep neural networks (DNNs) have become increasingly important due to their excellent empirical performance on a wide range of problems. However, regularization is generally achieved by indirect means, largely due to the complex set of functions defined by a network and the difficulty in measuring function complexity. There exists no method in the literature for additive regularization based on a norm of the function, as is classically considered in statistical learning theory. 
In this work, we study the tractability of function norms for deep neural networks with ReLU activations. We provide, to the best of our knowledge, the first proof in the literature of the NP-hardness of computing function norms of DNNs of 3 or more layers. We also highlight a fundamental difference between shallow and deep networks. In light of these results, we propose a new regularization strategy based on approximate function norms, and show its efficiency on a segmentation task with a DNN. AU - Rannen-Triki, Amal AU - Berman, Maxim AU - Kolmogorov, Vladimir AU - Blaschko, Matthew B. ID - 7639 SN - 9781728150239 T2 - Proceedings of the 2019 International Conference on Computer Vision Workshop TI - Function norms for neural networks ER - TY - CONF AB - The accuracy of information retrieval systems is often measured using complex loss functions such as the average precision (AP) or the normalized discounted cumulative gain (NDCG). Given a set of positive and negative samples, the parameters of a retrieval system can be estimated by minimizing these loss functions. However, the non-differentiability and non-decomposability of these loss functions do not allow for simple gradient based optimization algorithms. This issue is generally circumvented by either optimizing a structured hinge-loss upper bound to the loss function or by using asymptotic methods like the direct-loss minimization framework. Yet, the high computational complexity of loss-augmented inference, which is necessary for both the frameworks, prohibits its use in large training data sets. To alleviate this deficiency, we present a novel quicksort flavored algorithm for a large class of non-decomposable loss functions. We provide a complete characterization of the loss functions that are amenable to our algorithm, and show that it includes both AP and NDCG based loss functions. 
Furthermore, we prove that no comparison based algorithm can improve upon the computational complexity of our approach asymptotically. We demonstrate the effectiveness of our approach in the context of optimizing the structured hinge loss upper bound of AP and NDCG loss for learning models for a variety of vision tasks. We show that our approach provides significantly better results than simpler decomposable loss functions, while requiring a comparable training time. AU - Mohapatra, Pritish AU - Rolinek, Michal AU - Jawahar, C V AU - Kolmogorov, Vladimir AU - Kumar, M Pawan ID - 273 SN - 9781538664209 T2 - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition TI - Efficient optimization for rank-based loss functions ER - TY - JOUR AB - We consider the recent formulation of the algorithmic Lovász Local Lemma [N. Harvey and J. Vondrák, in Proceedings of FOCS, 2015, pp. 1327–1345; D. Achlioptas and F. Iliopoulos, in Proceedings of SODA, 2016, pp. 2024–2038; D. Achlioptas, F. Iliopoulos, and V. Kolmogorov, A Local Lemma for Focused Stochastic Algorithms, arXiv preprint, 2018] for finding objects that avoid “bad features,” or “flaws.” It extends the Moser–Tardos resampling algorithm [R. A. Moser and G. Tardos, J. ACM, 57 (2010), 11] to more general discrete spaces. At each step the method picks a flaw present in the current state and goes to a new state according to some prespecified probability distribution (which depends on the current state and the selected flaw). However, the recent formulation is less flexible than the Moser–Tardos method since it requires a specific flaw selection rule, whereas the algorithm of Moser and Tardos allows an arbitrary rule (and thus can potentially be implemented more efficiently). We formulate a new “commutativity” condition and prove that it is sufficient for an arbitrary rule to work. It also enables an efficient parallelization under an additional assumption. 
We then show that existing resampling oracles for perfect matchings and permutations do satisfy this condition. AU - Kolmogorov, Vladimir ID - 5975 IS - 6 JF - SIAM Journal on Computing SN - 0097-5397 TI - Commutativity in the algorithmic Lovász local lemma VL - 47 ER - TY - JOUR AB - An N-superconcentrator is a directed, acyclic graph with N input nodes and N output nodes such that every subset of the inputs and every subset of the outputs of the same cardinality can be connected by node-disjoint paths. It is known that linear-size and bounded-degree superconcentrators exist. We prove the existence of such superconcentrators with asymptotic density 25.3 (where the density is the number of edges divided by N). The previously best known densities were 28 [12] and 27.4136 [17]. AU - Kolmogorov, Vladimir AU - Rolinek, Michal ID - 18 IS - 10 JF - Ars Combinatoria SN - 0381-7032 TI - Superconcentrators of density 25.3 VL - 141 ER - TY - JOUR AB - The main result of this article is a generalization of the classical blossom algorithm for finding perfect matchings. Our algorithm can efficiently solve Boolean CSPs where each variable appears in exactly two constraints (we call it edge CSP) and all constraints are even Δ-matroid relations (represented by lists of tuples). As a consequence of this, we settle the complexity classification of planar Boolean CSPs started by Dvorak and Kupec. Using a reduction to even Δ-matroids, we then extend the tractability result to larger classes of Δ-matroids that we call efficiently coverable. It properly includes classes that were known to be tractable before, namely, co-independent, compact, local, linear, and binary, with the following caveat: We represent Δ-matroids by lists of tuples, while the last two use a representation by matrices. Since an n × n matrix can represent exponentially many tuples, our tractability result is not strictly stronger than the known algorithm for linear and binary Δ-matroids. 
AU - Kazda, Alexandr AU - Kolmogorov, Vladimir AU - Rolinek, Michal ID - 6032 IS - 2 JF - ACM Transactions on Algorithms TI - Even delta-matroids and the complexity of planar boolean CSPs VL - 15 ER - TY - JOUR AB - An instance of the valued constraint satisfaction problem (VCSP) is given by a finite set of variables, a finite domain of labels, and a sum of functions, each function depending on a subset of the variables. Each function can take finite values specifying costs of assignments of labels to its variables or the infinite value, which indicates an infeasible assignment. The goal is to find an assignment of labels to the variables that minimizes the sum. We study, assuming that P ≠ NP, how the complexity of this very general problem depends on the set of functions allowed in the instances, the so-called constraint language. The case when all allowed functions take values in {0, ∞} corresponds to ordinary CSPs, where one deals only with the feasibility issue, and there is no optimization. This case is the subject of the algebraic CSP dichotomy conjecture predicting for which constraint languages CSPs are tractable (i.e., solvable in polynomial time) and for which they are NP-hard. The case when all allowed functions take only finite values corresponds to a finite-valued CSP, where the feasibility aspect is trivial and one deals only with the optimization issue. The complexity of finite-valued CSPs was fully classified by Thapper and Živný. An algebraic necessary condition for tractability of a general-valued CSP with a fixed constraint language was recently given by Kozik and Ochremiak. As our main result, we prove that if a constraint language satisfies this algebraic necessary condition, and the feasibility CSP (i.e., the problem of deciding whether a given instance has a feasible solution) corresponding to the VCSP with this language is tractable, then the VCSP is tractable. 
The algorithm is a simple combination of the assumed algorithm for the feasibility CSP and the standard LP relaxation. As a corollary, we obtain that a dichotomy for ordinary CSPs would imply a dichotomy for general-valued CSPs. AU - Kolmogorov, Vladimir AU - Krokhin, Andrei AU - Rolinek, Michal ID - 644 IS - 3 JF - SIAM Journal on Computing TI - The complexity of general-valued CSPs VL - 46 ER - TY - CONF AB - The main result of this paper is a generalization of the classical blossom algorithm for finding perfect matchings. Our algorithm can efficiently solve Boolean CSPs where each variable appears in exactly two constraints (we call it edge CSP) and all constraints are even Δ-matroid relations (represented by lists of tuples). As a consequence of this, we settle the complexity classification of planar Boolean CSPs started by Dvorak and Kupec. Knowing that edge CSP is tractable for even Δ-matroid constraints allows us to extend the tractability result to a larger class of Δ-matroids that includes many classes that were known to be tractable before, namely co-independent, compact, local and binary. AU - Kazda, Alexandr AU - Kolmogorov, Vladimir AU - Rolinek, Michal ID - 1192 SN - 978-161197478-2 TI - Even delta-matroids and the complexity of planar Boolean CSPs ER - TY - CONF AB - We consider the problem of estimating the partition function Z(β) = ∑_x exp(−βH(x)) of a Gibbs distribution with a Hamiltonian H(⋅), or more precisely the logarithm of the ratio q = ln(Z(0)/Z(β)). It has been recently shown how to approximate q with high probability assuming the existence of an oracle that produces samples from the Gibbs distribution for a given parameter value in [0,β]. The current best known approach due to Huber [9] uses O(q ln n ⋅ [ln q + ln ln n + ε^{−2}]) oracle calls on average where ε is the desired accuracy of approximation and H(⋅) is assumed to lie in {0}∪[1,n]. We improve the complexity to O(q ln n ⋅ ε^{−2}) oracle calls. 
We also show that the same complexity can be achieved if exact oracles are replaced with approximate sampling oracles that are within O(ε^2/(q ln n)) variation distance from exact oracles. Finally, we prove a lower bound of Ω(q ⋅ ε^{−2}) oracle calls under a natural model of computation. AU - Kolmogorov, Vladimir ID - 274 T2 - Proceedings of the 31st Conference On Learning Theory TI - A faster approximation algorithm for the Gibbs partition function VL - 75 ER - TY - CONF AB - We study the time- and memory-complexities of the problem of computing labels of (multiple) randomly selected challenge-nodes in a directed acyclic graph. The w-bit label of a node is the hash of the labels of its parents, and the hash function is modeled as a random oracle. Specific instances of this problem underlie both proofs of space [Dziembowski et al. CRYPTO’15] as well as popular memory-hard functions like scrypt. As our main tool, we introduce the new notion of a probabilistic parallel entangled pebbling game, a new type of combinatorial pebbling game on a graph, which is closely related to the labeling game on the same graph. As a first application of our framework, we prove that for scrypt, when the underlying hash function is invoked n times, the cumulative memory complexity (CMC) (a notion recently introduced by Alwen and Serbinenko (STOC’15) to capture amortized memory-hardness for parallel adversaries) is at least Ω(w · (n/log(n))^2). This bound holds for adversaries that can store many natural functions of the labels (e.g., linear combinations), but still not arbitrary functions thereof. We then introduce and study a combinatorial quantity, and show how a sufficiently small upper bound on it (which we conjecture) extends our CMC bound for scrypt to hold against arbitrary adversaries. 
We also show that such an upper bound solves the main open problem for proofs-of-space protocols: namely, establishing that the time complexity of computing the label of a random node in a graph on n nodes (given an initial kw-bit state) reduces tightly to the time complexity for black pebbling on the same graph (given an initial k-node pebbling). AU - Alwen, Joel F AU - Chen, Binyi AU - Kamath Hosdurg, Chethan AU - Kolmogorov, Vladimir AU - Pietrzak, Krzysztof Z AU - Tessaro, Stefano ID - 1231 TI - On the complexity of scrypt and proofs of space in the parallel random oracle model VL - 9666 ER - TY - JOUR AB - We consider the problem of minimizing the continuous valued total variation subject to different unary terms on trees and propose fast direct algorithms based on dynamic programming to solve these problems. We treat both the convex and the nonconvex case and derive worst-case complexities that are equal to or better than existing methods. We show applications to total variation based two-dimensional image processing and computer vision problems based on a Lagrangian decomposition approach. The resulting algorithms are very efficient, offer a high degree of parallelism, and come along with memory requirements which are only in the order of the number of image pixels. AU - Kolmogorov, Vladimir AU - Pock, Thomas AU - Rolinek, Michal ID - 1377 IS - 2 JF - SIAM Journal on Imaging Sciences TI - Total variation on a tree VL - 9 ER - TY - CONF AB - We consider the recent formulation of the Algorithmic Lovász Local Lemma [1], [2] for finding objects that avoid "bad features", or "flaws". It extends the Moser-Tardos resampling algorithm [3] to more general discrete spaces. At each step the method picks a flaw present in the current state and "resamples" it using a "resampling oracle" provided by the user. 
However, it is less flexible than the Moser-Tardos method since [1], [2] require a specific flaw selection rule, whereas [3] allows an arbitrary rule (and thus can potentially be implemented more efficiently). We formulate a new "commutativity" condition, and prove that it is sufficient for an arbitrary rule to work. It also enables an efficient parallelization under an additional assumption. We then show that existing resampling oracles for perfect matchings and permutations do satisfy this condition. Finally, we generalize the precondition in [2] (in the case of symmetric potential causality graphs). This unifies special cases that previously were treated separately. AU - Kolmogorov, Vladimir ID - 1193 T2 - Proceedings - Annual IEEE Symposium on Foundations of Computer Science TI - Commutativity in the algorithmic Lovasz local lemma VL - 2016-December ER - TY - JOUR AB - We consider Conditional random fields (CRFs) with pattern-based potentials defined on a chain. In this model the energy of a string (labeling) (Formula presented.) is the sum of terms over intervals [i, j] where each term is non-zero only if the substring (Formula presented.) equals a prespecified pattern w. Such CRFs can be naturally applied to many sequence tagging problems. We present efficient algorithms for the three standard inference tasks in a CRF, namely computing (i) the partition function, (ii) marginals, and (iii) computing the MAP. Their complexities are respectively (Formula presented.), (Formula presented.) and (Formula presented.) where L is the combined length of input patterns, (Formula presented.) is the maximum length of a pattern, and D is the input alphabet. This improves on the previous algorithms of Ye et al. (NIPS, 2009) whose complexities are respectively (Formula presented.), (Formula presented.) and (Formula presented.), where (Formula presented.) is the number of input patterns. 
In addition, we give an efficient algorithm for sampling, and revisit the case of MAP with non-positive weights. AU - Kolmogorov, Vladimir AU - Takhanov, Rustem ID - 1794 IS - 1 JF - Algorithmica TI - Inference algorithms for pattern-based CRFs on sequence data VL - 76 ER - TY - CONF AB - Constraint Satisfaction Problem (CSP) is a fundamental algorithmic problem that appears in many areas of Computer Science. It can be equivalently stated as computing a homomorphism R → Γ between two relational structures, e.g. between two directed graphs. Analyzing its complexity has been a prominent research direction, especially for the fixed template CSPs where the right side Γ is fixed and the left side R is unconstrained. Far fewer results are known for the hybrid setting that restricts both sides simultaneously. It assumes that R belongs to a certain class of relational structures (called a structural restriction in this paper). We study which structural restrictions are effective, i.e. there exists a fixed template Γ (from a certain class of languages) for which the problem is tractable when R is restricted, and NP-hard otherwise. We provide a characterization for structural restrictions that are closed under inverse homomorphisms. The criterion is based on the chromatic number of a relational structure defined in this paper; it generalizes the standard chromatic number of a graph. As our main tool, we use the algebraic machinery developed for fixed template CSPs. To apply it to our case, we introduce a new construction called a “lifted language”. We also give a characterization for structural restrictions corresponding to minor-closed families of graphs, extend results to certain Valued CSPs (namely conservative valued languages), and state implications for (valued) CSPs with ordered variables and for the maximum weight independent set problem on some restricted families of graphs. 
AU - Kolmogorov, Vladimir AU - Rolinek, Michal AU - Takhanov, Rustem ID - 1636 SN - 978-3-662-48970-3 T2 - 26th International Symposium TI - Effectiveness of structural restrictions for hybrid CSPs VL - 9472 ER - TY - JOUR AB - We propose a new family of message passing techniques for MAP estimation in graphical models which we call Sequential Reweighted Message Passing (SRMP). Special cases include well-known techniques such as Min-Sum Diffusion (MSD) and a faster Sequential Tree-Reweighted Message Passing (TRW-S). Importantly, our derivation is simpler than the original derivation of TRW-S, and does not involve a decomposition into trees. This allows easy generalizations. The new family of algorithms can be viewed as a generalization of TRW-S from pairwise to higher-order graphical models. We test SRMP on several real-world problems with promising results. AU - Kolmogorov, Vladimir ID - 1841 IS - 5 JF - IEEE Transactions on Pattern Analysis and Machine Intelligence TI - A new look at reweighted message passing VL - 37 ER - TY - CONF AB - Structural support vector machines (SSVMs) are amongst the best performing models for structured computer vision tasks, such as semantic image segmentation or human pose estimation. Training SSVMs, however, is computationally costly, because it requires repeated calls to a structured prediction subroutine (called \emph{max-oracle}), which has to solve an optimization problem itself, e.g. a graph cut. In this work, we introduce a new algorithm for SSVM training that is more efficient than earlier techniques when the max-oracle is computationally expensive, as is frequently the case in computer vision tasks. The main idea is to (i) combine the recent stochastic Block-Coordinate Frank-Wolfe algorithm with efficient hyperplane caching, and (ii) use an automatic selection rule for deciding whether to call the exact max-oracle or to rely on an approximate one based on the cached hyperplanes. 
We show experimentally that this strategy leads to faster convergence to the optimum with respect to the number of required oracle calls, and that this translates into faster convergence with respect to the total runtime when the max-oracle is slow compared to the other steps of the algorithm. AU - Shah, Neel AU - Kolmogorov, Vladimir AU - Lampert, Christoph ID - 1859 TI - A multi-plane block-coordinate Frank-Wolfe algorithm for training structural SVMs with a costly max-oracle ER - TY - JOUR AB - A class of valued constraint satisfaction problems (VCSPs) is characterised by a valued constraint language, a fixed set of cost functions on a finite domain. Finite-valued constraint languages contain functions that take on rational costs and general-valued constraint languages contain functions that take on rational or infinite costs. An instance of the problem is specified by a sum of functions from the language with the goal to minimise the sum. This framework includes and generalises well-studied constraint satisfaction problems (CSPs) and maximum constraint satisfaction problems (Max-CSPs). Our main result is a precise algebraic characterisation of valued constraint languages whose instances can be solved exactly by the basic linear programming relaxation (BLP). For a general-valued constraint language Γ, BLP is a decision procedure for Γ if and only if Γ admits a symmetric fractional polymorphism of every arity. For a finite-valued constraint language Γ, BLP is a decision procedure if and only if Γ admits a symmetric fractional polymorphism of some arity, or equivalently, if Γ admits a symmetric fractional polymorphism of arity 2. 
Using these results, we obtain tractability of several novel and previously widely-open classes of VCSPs, including problems over valued constraint languages that are: (1) submodular on arbitrary lattices; (2) bisubmodular (also known as k-submodular) on arbitrary finite domains; (3) weakly (and hence strongly) tree-submodular on arbitrary trees. AU - Kolmogorov, Vladimir AU - Thapper, Johan AU - Živný, Stanislav ID - 2271 IS - 1 JF - SIAM Journal on Computing TI - The power of linear programming for general-valued CSPs VL - 44 ER - TY - CONF AB - An instance of the Valued Constraint Satisfaction Problem (VCSP) is given by a finite set of variables, a finite domain of labels, and a sum of functions, each function depending on a subset of the variables. Each function can take finite values specifying costs of assignments of labels to its variables or the infinite value, which indicates an infeasible assignment. The goal is to find an assignment of labels to the variables that minimizes the sum. We study, assuming that P ≠ NP, how the complexity of this very general problem depends on the set of functions allowed in the instances, the so-called constraint language. The case when all allowed functions take values in {0, ∞} corresponds to ordinary CSPs, where one deals only with the feasibility issue and there is no optimization. This case is the subject of the Algebraic CSP Dichotomy Conjecture predicting for which constraint languages CSPs are tractable (i.e. solvable in polynomial time) and for which NP-hard. The case when all allowed functions take only finite values corresponds to finite-valued CSP, where the feasibility aspect is trivial and one deals only with the optimization issue. The complexity of finite-valued CSPs was fully classified by Thapper and Zivny. An algebraic necessary condition for tractability of a general-valued CSP with a fixed constraint language was recently given by Kozik and Ochremiak. 
As our main result, we prove that if a constraint language satisfies this algebraic necessary condition, and the feasibility CSP (i.e. the problem of deciding whether a given instance has a feasible solution) corresponding to the VCSP with this language is tractable, then the VCSP is tractable. The algorithm is a simple combination of the assumed algorithm for the feasibility CSP and the standard LP relaxation. As a corollary, we obtain that a dichotomy for ordinary CSPs would imply a dichotomy for general-valued CSPs. AU - Kolmogorov, Vladimir AU - Krokhin, Andrei AU - Rolinek, Michal ID - 1637 TI - The complexity of general-valued CSPs ER - TY - CONF AB - Proofs of work (PoW) have been suggested by Dwork and Naor (Crypto’92) as protection to a shared resource. The basic idea is to ask the service requestor to dedicate some non-trivial amount of computational work to every request. The original applications included prevention of spam and protection against denial of service attacks. More recently, PoWs have been used to prevent double spending in the Bitcoin digital currency system. In this work, we put forward an alternative concept for PoWs - so-called proofs of space (PoS), where a service requestor must dedicate a significant amount of disk space as opposed to computation. We construct secure PoS schemes in the random oracle model (with one additional mild assumption required for the proof to go through), using graphs with high “pebbling complexity” and Merkle hash-trees. We discuss some applications, including follow-up work where a decentralized digital currency scheme called Spacecoin is constructed that uses PoS (instead of wasteful PoW like in Bitcoin) to prevent double spending. 
The main technical contribution of this work is the construction of (directed, loop-free) graphs on N vertices with in-degree O(log logN) such that even if one places Θ(N) pebbles on the nodes of the graph, there’s a constant fraction of nodes that needs Θ(N) steps to be pebbled (where in every step one can put a pebble on a node if all its parents have a pebble). AU - Dziembowski, Stefan AU - Faust, Sebastian AU - Kolmogorov, Vladimir AU - Pietrzak, Krzysztof Z ID - 1675 SN - 0302-9743 T2 - 35th Annual Cryptology Conference TI - Proofs of space VL - 9216 ER - TY - CONF AB - Energies with high-order non-submodular interactions have been shown to be very useful in vision due to their high modeling power. Optimization of such energies, however, is generally NP-hard. A naive approach that works for small problem instances is exhaustive search, that is, enumeration of all possible labelings of the underlying graph. We propose a general minimization approach for large graphs based on enumeration of labelings of certain small patches. This partial enumeration technique reduces complex high-order energy formulations to pairwise Constraint Satisfaction Problems with unary costs (uCSP), which can be efficiently solved using standard methods like TRW-S. Our approach outperforms a number of existing state-of-the-art algorithms on well known difficult problems (e.g. curvature regularization, stereo, deconvolution); it gives near global minimum and better speed. Our main application of interest is curvature regularization. In the context of segmentation, our partial enumeration technique allows to evaluate curvature directly on small patches using a novel integral geometry approach. AU - Olsson, Carl AU - Ulen, Johannes AU - Boykov, Yuri AU - Kolmogorov, Vladimir ID - 2275 TI - Partial enumeration and curvature regularization ER - TY - CONF AB - Representation languages for coalitional games are a key research area in algorithmic game theory. 
There is an inherent tradeoff between how general a language is, allowing it to capture more elaborate games, and how hard it is computationally to optimize and solve such games. One prominent such language is the simple yet expressive Weighted Graph Games (WGGs) representation (Deng and Papadimitriou 1994), which maintains knowledge about synergies between agents in the form of an edge weighted graph. We consider the problem of finding the optimal coalition structure in WGGs. The agents in such games are vertices in a graph, and the value of a coalition is the sum of the weights of the edges present between coalition members. The optimal coalition structure is a partition of the agents to coalitions, that maximizes the sum of utilities obtained by the coalitions. We show that finding the optimal coalition structure is not only hard for general graphs, but is also intractable for restricted families such as planar graphs which are amenable for many other combinatorial problems. We then provide algorithms with constant factor approximations for planar, minor-free and bounded degree graphs. AU - Bachrach, Yoram AU - Kohli, Pushmeet AU - Kolmogorov, Vladimir AU - Zadimoghaddam, Morteza ID - 2270 TI - Optimal Coalition Structures in Cooperative Graph Games ER - TY - GEN AB - We propose a new family of message passing techniques for MAP estimation in graphical models which we call Sequential Reweighted Message Passing (SRMP). Special cases include well-known techniques such as Min-Sum Diffusion (MSD) and a faster Sequential Tree-Reweighted Message Passing (TRW-S). Importantly, our derivation is simpler than the original derivation of TRW-S, and does not involve a decomposition into trees. This allows easy generalizations. We present such a generalization for the case of higher-order graphical models, and test it on several real-world problems with promising results. 
AU - Vladimir Kolmogorov ID - 2273 TI - Reweighted message passing revisited ER - TY - CONF AB - The problem of minimizing the Potts energy function frequently occurs in computer vision applications. One way to tackle this NP-hard problem was proposed by Kovtun [19, 20]. It identifies a part of an optimal solution by running k maxflow computations, where k is the number of labels. The number of “labeled” pixels can be significant in some applications, e.g. 50-93% in our tests for stereo. We show how to reduce the runtime to O (log k) maxflow computations (or one parametric maxflow computation). Furthermore, the output of our algorithm allows to speed-up the subsequent alpha expansion for the unlabeled part, or can be used as it is for time-critical applications. To derive our technique, we generalize the algorithm of Felzenszwalb et al. [7] for Tree Metrics . We also show a connection to k-submodular functions from combinatorial optimization, and discuss k-submodular relaxations for general energy functions. AU - Gridchyn, Igor AU - Kolmogorov, Vladimir ID - 2276 TI - Potts model, parametric maxflow and k-submodular functions ER - TY - CONF AB - A class of valued constraint satisfaction problems (VCSPs) is characterised by a valued constraint language, a fixed set of cost functions on a finite domain. An instance of the problem is specified by a sum of cost functions from the language with the goal to minimise the sum. We study which classes of finite-valued languages can be solved exactly by the basic linear programming relaxation (BLP). Thapper and Živný showed [20] that if BLP solves the language then the language admits a binary commutative fractional polymorphism. We prove that the converse is also true. This leads to a necessary and a sufficient condition which can be checked in polynomial time for a given language. In contrast, the previous necessary and sufficient condition due to [20] involved infinitely many inequalities. 
More recently, Thapper and Živný [21] showed (using, in particular, a technique introduced in this paper) that core languages that do not satisfy our condition are NP-hard. Taken together, these results imply that a finite-valued language can either be solved using Linear Programming or is NP-hard. AU - Kolmogorov, Vladimir ID - 2518 IS - 1 TI - The power of linear programming for finite-valued CSPs: A constructive characterization VL - 7965 ER - TY - JOUR AB - We study the complexity of valued constraint satisfaction problems (VCSPs) parametrized by a constraint language, a fixed set of cost functions over a finite domain. An instance of the problem is specified by a sum of cost functions from the language and the goal is to minimize the sum. Under the unique games conjecture, the approximability of finite-valued VCSPs is well understood, see Raghavendra [2008]. However, there is no characterization of finite-valued VCSPs, let alone general-valued VCSPs, that can be solved exactly in polynomial time, thus giving insights from a combinatorial optimization perspective. We consider the case of languages containing all possible unary cost functions. In the case of languages consisting of only {0, ∞}-valued cost functions (i.e., relations), such languages have been called conservative and studied by Bulatov [2003, 2011] and recently by Barto [2011]. Since we study valued languages, we call a language conservative if it contains all finite-valued unary cost functions. The computational complexity of conservative valued languages has been studied by Cohen et al. [2006] for languages over Boolean domains, by Deineko et al. [2008] for {0, 1}-valued languages (a.k.a Max-CSP), and by Takhanov [2010a] for {0, ∞}-valued languages containing all finite-valued unary cost functions (a.k.a. Min-Cost-Hom). 
We prove a Schaefer-like dichotomy theorem for conservative valued languages: if all cost functions in the language satisfy a certain condition (specified by a complementary combination of STP and MJN multimorphisms), then any instance can be solved in polynomial time (via a new algorithm developed in this article), otherwise the language is NP-hard. This is the first complete complexity classification of general-valued constraint languages over non-Boolean domains. It is a common phenomenon that complexity classifications of problems over non-Boolean domains are significantly harder than the Boolean cases. The polynomial-time algorithm we present for the tractable cases is a generalization of the submodular minimization problem and a result of Cohen et al. [2008]. Our results generalize previous results by Takhanov [2010a] and (a subset of results) by Cohen et al. [2006] and Deineko et al. [2008]. Moreover, our results do not rely on any computer-assisted search as in Deineko et al. [2008], and provide a powerful tool for proving hardness of finite-valued and general-valued languages. AU - Kolmogorov, Vladimir AU - Živný, Stanislav ID - 2828 IS - 2 JF - Journal of the ACM TI - The complexity of conservative valued CSPs VL - 60 ER - TY - CONF AB - We introduce the M-modes problem for graphical models: predicting the M label configurations of highest probability that are at the same time local maxima of the probability landscape. M-modes have multiple possible applications: because they are intrinsically diverse, they provide a principled alternative to non-maximum suppression techniques for structured prediction, they can act as codebook vectors for quantizing the configuration space, or they can form component centers for mixture model approximation. We present two algorithms for solving the M-modes problem. The first algorithm solves the problem in polynomial time when the underlying graphical model is a simple chain. 
The second algorithm solves the problem for junction chains. On synthetic and real datasets, we demonstrate how M-modes can improve the performance of prediction. We also use the generated modes as a tool to understand the topography of the probability distribution of configurations, for example with relation to the training set size and amount of noise in the data. AU - Chen, Chao AU - Kolmogorov, Vladimir AU - Yan, Zhu AU - Metaxas, Dimitris AU - Lampert, Christoph ID - 2901 TI - Computing the M most probable modes of a graphical model VL - 31 ER - TY - CONF AB - We consider Conditional Random Fields (CRFs) with pattern-based potentials defined on a chain. In this model the energy of a string (labeling) x1...xn is the sum of terms over intervals [i,j] where each term is non-zero only if the substring xi...xj equals a prespecified pattern α. Such CRFs can be naturally applied to many sequence tagging problems. We present efficient algorithms for the three standard inference tasks in a CRF, namely computing (i) the partition function, (ii) marginals, and (iii) the MAP. Their complexities are respectively O(nL), O(nLℓmax) and O(nLmin{|D|,log(ℓmax+1)}) where L is the combined length of input patterns, ℓmax is the maximum length of a pattern, and D is the input alphabet. This improves on the previous algorithms of (Ye et al., 2009) whose complexities are respectively O(nL|D|), O(n|Γ|L²ℓmax²) and O(nL|D|), where |Γ| is the number of input patterns. In addition, we give an efficient algorithm for sampling. Finally, we consider the case of non-positive weights. (Komodakis & Paragios, 2009) gave an O(nL) algorithm for computing the MAP. We present a modification that has the same worst-case complexity but can beat it in the best case. 
AU - Takhanov, Rustem AU - Kolmogorov, Vladimir ID - 2272 IS - 3 T2 - ICML'13 Proceedings of the 30th International Conference on International TI - Inference algorithms for pattern-based CRFs on sequence data VL - 28 ER - TY - GEN AB - Proofs of work (PoW) have been suggested by Dwork and Naor (Crypto'92) as protection to a shared resource. The basic idea is to ask the service requestor to dedicate some non-trivial amount of computational work to every request. The original applications included prevention of spam and protection against denial of service attacks. More recently, PoWs have been used to prevent double spending in the Bitcoin digital currency system. In this work, we put forward an alternative concept for PoWs -- so-called proofs of space (PoS), where a service requestor must dedicate a significant amount of disk space as opposed to computation. We construct secure PoS schemes in the random oracle model, using graphs with high "pebbling complexity" and Merkle hash-trees. AU - Dziembowski, Stefan AU - Faust, Sebastian AU - Kolmogorov, Vladimir AU - Pietrzak, Krzysztof Z ID - 2274 TI - Proofs of Space ER - TY - CONF AB - In this paper we investigate k-submodular functions. This natural family of discrete functions includes submodular and bisubmodular functions as the special cases k = 1 and k = 2 respectively. In particular we generalize the known Min-Max-Theorem for submodular and bisubmodular functions. This theorem asserts that the minimum of the (bi)submodular function can be found by solving a maximization problem over a (bi)submodular polyhedron. We define a k-submodular polyhedron, prove a Min-Max-Theorem for k-submodular functions, and give a greedy algorithm to construct the vertices of the polyhedron. AU - Huber, Anna AU - Kolmogorov, Vladimir ID - 2930 TI - Towards minimizing k-submodular functions VL - 7422 ER - TY - GEN AB - This paper addresses the problem of approximate MAP-MRF inference in general graphical models. 
Following [36], we consider a family of linear programming relaxations of the problem where each relaxation is specified by a set of nested pairs of factors for which the marginalization constraint needs to be enforced. We develop a generalization of the TRW-S algorithm [9] for this problem, where we use a decomposition into junction chains, monotonic w.r.t. some ordering on the nodes. This generalizes the monotonic chains in [9] in a natural way. We also show how to deal with nested factors in an efficient way. Experiments show an improvement over min-sum diffusion, MPLP and subgradient ascent algorithms on a number of computer vision and natural language processing problems. AU - Kolmogorov, Vladimir AU - Schoenemann, Thomas ID - 2928 T2 - arXiv TI - Generalized sequential tree-reweighted message passing ER - TY - GEN AU - Vladimir Kolmogorov ID - 2929 TI - The power of linear programming for valued CSPs: a constructive characterization ER - TY - JOUR AB - In this paper, we present a new approach for establishing correspondences between sparse image features related by an unknown nonrigid mapping and corrupted by clutter and occlusion, such as points extracted from images of different instances of the same object category. We formulate this matching task as an energy minimization problem by defining an elaborate objective function of the appearance and the spatial arrangement of the features. Optimization of this energy is an instance of graph matching, which is in general an NP-hard problem. We describe a novel graph matching optimization technique, which we refer to as dual decomposition (DD), and demonstrate on a variety of examples that this method outperforms existing graph matching algorithms. In the majority of our examples, DD is able to find the global minimum within a minute. The ability to globally optimize the objective allows us to accurately learn the parameters of our matching model from training examples. 
We show on several matching tasks that our learned model yields results superior to those of state-of-the-art methods. AU - Torresani, Lorenzo AU - Kolmogorov, Vladimir AU - Rother, Carsten ID - 2931 IS - 2 JF - IEEE Transactions on Pattern Analysis and Machine Intelligence TI - A dual decomposition approach to feature correspondence VL - 35 ER - TY - JOUR AB - We consider the problem of minimizing a function represented as a sum of submodular terms. We assume each term allows an efficient computation of exchange capacities. This holds, for example, for terms depending on a small number of variables, or for certain cardinality-dependent terms. A naive application of submodular minimization algorithms would not exploit the existence of specialized exchange capacity subroutines for individual terms. To overcome this, we cast the problem as a submodular flow (SF) problem in an auxiliary graph in such a way that applying most existing SF algorithms would rely only on these subroutines. We then explore in more detail Iwata's capacity scaling approach for submodular flows (Iwata 1997 [19]). In particular, we show how to improve its complexity in the case when the function contains cardinality-dependent terms. AU - Kolmogorov, Vladimir ID - 3117 IS - 15 JF - Discrete Applied Mathematics TI - Minimizing a sum of submodular functions VL - 160 ER - TY - CONF AB - We study the complexity of valued constraint satisfaction problems (VCSP). A problem from VCSP is characterised by a constraint language, a fixed set of cost functions over a finite domain. An instance of the problem is specified by a sum of cost functions from the language and the goal is to minimise the sum. Under the unique games conjecture, the approximability of finite-valued VCSPs is well-understood, see Raghavendra [FOCS’08]. 
However, there is no characterisation of finite-valued VCSPs, let alone general-valued VCSPs, that can be solved exactly in polynomial time, thus giving insights from a combinatorial optimisation perspective. We consider the case of languages containing all possible unary cost functions. In the case of languages consisting of only {0, ∞}-valued cost functions (i.e. relations), such languages have been called conservative and studied by Bulatov [LICS’03] and recently by Barto [LICS’11]. Since we study valued languages, we call a language conservative if it contains all finite-valued unary cost functions. The computational complexity of conservative valued languages has been studied by Cohen et al. [AIJ’06] for languages over Boolean domains, by Deineko et al. [JACM’08] for {0,1}-valued languages (a.k.a Max-CSP), and by Takhanov [STACS’10] for {0,∞}-valued languages containing all finite-valued unary cost functions (a.k.a. Min-Cost-Hom). We prove a Schaefer-like dichotomy theorem for conservative valued languages: if all cost functions in the language satisfy a certain condition (specified by a complementary combination of STP and MJN multimorphisms), then any instance can be solved in polynomial time (via a new algorithm developed in this paper), otherwise the language is NP-hard. This is the first complete complexity classification of general-valued constraint languages over non-Boolean domains. It is a common phenomenon that complexity classifications of problems over non-Boolean domains are significantly harder than the Boolean cases. The polynomial-time algorithm we present for the tractable cases is a generalisation of the submodular minimisation problem and a result of Cohen et al. [TCS’08]. Our results generalise previous results by Takhanov [STACS’10] and (a subset of results) by Cohen et al. [AIJ’06] and Deineko et al. [JACM’08]. Moreover, our results do not rely on any computer-assisted search as in Deineko et al. 
[JACM’08], and provide a powerful tool for proving hardness of finite-valued and general-valued languages. AU - Vladimir Kolmogorov AU - Živný, Stanislav ID - 3284 TI - The complexity of conservative valued CSPs ER - TY - JOUR AB - Consider a convex relaxation f̂ of a pseudo-Boolean function f. We say that the relaxation is totally half-integral if f̂(x) is a polyhedral function with half-integral extreme points x, and this property is preserved after adding an arbitrary combination of constraints of the form x_i = x_j, x_i = 1 - x_j, and x_i = γ where γ∈{0,1,1/2} is a constant. A well-known example is the roof duality relaxation for quadratic pseudo-Boolean functions f. We argue that total half-integrality is a natural requirement for generalizations of roof duality to arbitrary pseudo-Boolean functions. Our contributions are as follows. First, we provide a complete characterization of totally half-integral relaxations f̂ by establishing a one-to-one correspondence with bisubmodular functions. Second, we give a new characterization of bisubmodular functions. Finally, we show some relationships between general totally half-integral relaxations and relaxations based on the roof duality. On the conceptual level, our results show that bisubmodular functions provide a natural generalization of the roof duality approach to higher-order terms. This can be viewed as a non-submodular analogue of the fact that submodular functions generalize the s-t minimum cut problem with non-negative weights to higher-order terms. AU - Kolmogorov, Vladimir ID - 3257 IS - 4-5 JF - Discrete Applied Mathematics TI - Generalized roof duality and bisubmodular functions VL - 160 ER - TY - CONF AB - We consider the problem of inference in a graphical model with binary variables. While in theory it is arguably preferable to compute marginal probabilities, in practice researchers often use MAP inference due to the availability of efficient discrete optimization algorithms. 
We bridge the gap between the two approaches by introducing the Discrete Marginals technique in which approximate marginals are obtained by minimizing an objective function with unary and pairwise terms over a discretized domain. This allows the use of techniques originally developed for MAP-MRF inference and learning. We explore two ways to set up the objective function - by discretizing the Bethe free energy and by learning it from training data. Experimental results show that for certain types of graphs a learned function can outperform the Bethe approximation. We also establish a link between the Bethe free energy and submodular functions. AU - Korc, Filip AU - Kolmogorov, Vladimir AU - Lampert, Christoph ID - 3124 TI - Approximating marginals using discrete energy minimization ER - TY - GEN AB - We consider the problem of inference in a graphical model with binary variables. While in theory it is arguably preferable to compute marginal probabilities, in practice researchers often use MAP inference due to the availability of efficient discrete optimization algorithms. We bridge the gap between the two approaches by introducing the Discrete Marginals technique in which approximate marginals are obtained by minimizing an objective function with unary and pairwise terms over a discretized domain. This allows the use of techniques originally developed for MAP-MRF inference and learning. We explore two ways to set up the objective function - by discretizing the Bethe free energy and by learning it from training data. Experimental results show that for certain types of graphs a learned function can outperform the Bethe approximation. We also establish a link between the Bethe free energy and submodular functions. 
AU - Korc, Filip AU - Kolmogorov, Vladimir AU - Lampert, Christoph ID - 5396 SN - 2664-1690 TI - Approximating marginals using discrete energy minimization ER - TY - CHAP AU - Vicente, Sara AU - Vladimir Kolmogorov AU - Rother, Carsten ED - Blake, Andrew ED - Kohli, Pushmeet ED - Rother, Carsten ID - 2922 T2 - Markov Random Fields for Vision and Image Processing TI - Graph-cut Based Image Segmentation with Connectivity Priors ER - TY - CHAP AU - Kumar, M Pawan AU - Vladimir Kolmogorov AU - Torr, Philip H ED - Blake, Andrew ED - Kohli, Pushmeet ED - Rother, Carsten ID - 2923 T2 - Markov Random Fields for Vision and Image Processing TI - Analyzing Convex Relaxations for MAP Estimation ER - TY - CHAP AU - Criminisi, Antonio AU - Cross, Geoffrey AU - Blake, Andrew AU - Vladimir Kolmogorov ED - Blake, Andrew ED - Kohli, Pushmeet ED - Rother, Carsten ID - 2924 T2 - Markov Random Fields for Vision and Image Processing TI - Bilayer Segmentation of Video ER - TY - CHAP AU - Rother, Carsten AU - Vladimir Kolmogorov AU - Boykov, Yuri AU - Blake, Andrew ED - Blake, Andrew ED - Kohli, Pushmeet ED - Rother, Carsten ID - 2925 T2 - Markov Random Fields for Vision and Image Processing TI - Interactive Foreground Extraction using graph cut ER - TY - CHAP AU - Boykov, Yuri AU - Vladimir Kolmogorov ED - Blake, Andrew ED - Kohli, Pushmeet ED - Rother, Carsten ID - 2935 T2 - Markov Random Fields for Vision and Image Processing TI - Basic graph cut algorithms ER - TY - CONF AB - We introduce a new class of functions that can be minimized in polynomial time in the value oracle model. These are functions f satisfying f(x) + f(y) ≥ f(x ∏ y) + f(x ∐ y) where the domain of each variable x i corresponds to nodes of a rooted binary tree, and operations ∏,∐ are defined with respect to this tree. Special cases include previously studied L-convex and bisubmodular functions, which can be obtained with particular choices of trees. 
We present a polynomial-time algorithm for minimizing functions in the new class. It combines Murota's steepest descent algorithm for L-convex functions with bisubmodular minimization algorithms. AU - Vladimir Kolmogorov ID - 3204 TI - Submodularity on a tree: Unifying L-convex and bisubmodular functions VL - 6907 ER - TY - CONF AB - In this paper we address the problem of finding the most probable state of discrete Markov random field (MRF) with associative pairwise terms. Although of practical importance, this problem is known to be NP-hard in general. We propose a new type of MRF decomposition, submodular decomposition (SMD). Unlike existing decomposition approaches SMD decomposes the initial problem into sub-problems corresponding to a specific class label while preserving the graph structure of each subproblem. Such decomposition enables us to take into account several types of global constraints in an efficient manner. We study theoretical properties of the proposed approach and demonstrate its applicability on a number of problems. AU - Osokin, Anton AU - Vetrov, Dmitry AU - Vladimir Kolmogorov ID - 3206 TI - Submodular decomposition framework for inference in associative Markov networks with global constraints ER - TY - CONF AB - This paper proposes a novel Linear Programming (LP) based algorithm, called Dynamic Tree-Block Coordinate Ascent (DT-BCA), for performing maximum a posteriori (MAP) inference in probabilistic graphical models. Unlike traditional message passing algorithms, which operate uniformly on the whole factor graph, our method dynamically chooses regions of the factor graph on which to focus message-passing efforts. We propose two criteria for selecting regions, including an efficiently computable upper-bound on the increase in the objective possible by passing messages in any particular region. 
This bound is derived from the theory of primal-dual methods from combinatorial optimization, and the forest that maximizes the bounds can be chosen efficiently using a maximum-spanning-tree-like algorithm. Experimental results show that our dynamic schedules significantly speed up state-of-the-art LP-based message-passing algorithms on a wide variety of real-world problems. AU - Tarlow, Daniel AU - Batra, Druv AU - Kohli, Pushmeet AU - Vladimir Kolmogorov ID - 3205 TI - Dynamic tree block coordinate ascent ER - TY - CONF AB - Cosegmentation is typically defined as the task of jointly segmenting something similar in a given set of images. Existing methods are too generic and so far have not demonstrated competitive results for any specific task. In this paper we overcome this limitation by adding two new aspects to cosegmentation: (1) the "something" has to be an object, and (2) the "similarity" measure is learned. In this way, we are able to achieve excellent results on the recently introduced iCoseg dataset, which contains small sets of images of either the same object instance or similar objects of the same class. The challenge of this dataset lies in the extreme changes in viewpoint, lighting, and object deformations within each set. We are able to considerably outperform several competitors. To achieve this performance, we borrow recent ideas from object recognition: the use of powerful features extracted from a pool of candidate object-like segmentations. We believe that our work will be beneficial to several application areas, such as image retrieval. AU - Vicente, Sara AU - Rother, Carsten AU - Vladimir Kolmogorov ID - 3207 TI - Object cosegmentation ER - TY - JOUR AB - We consider the following problem: given an undirected weighted graph G = (V,E,c) with nonnegative weights, minimize function c(δ(Π))- λ|Π| for all values of parameter λ. 
Here Π is a partition of the set of nodes, the first term is the cost of edges whose endpoints belong to different components of the partition, and |Π| is the number of components. The current best known algorithm for this problem has complexity O(|V|^2) maximum flow computations. We improve it to |V| parametric maximum flow computations. We observe that the complexity can be improved further for families of graphs which admit a good separator, e.g. for planar graphs. AU - Vladimir Kolmogorov ID - 3202 IS - 4 JF - Algorithmica TI - A faster algorithm for computing the principal sequence of partitions of a graph VL - 56 ER - TY - CONF AB - The problem of cosegmentation consists of segmenting the same object (or objects of the same class) in two or more distinct images. Recently a number of different models have been proposed for this problem. However, no comparison of such models and corresponding optimization techniques has been done so far. We analyze three existing models: the L1 norm model of Rother et al. [1], the L2 norm model of Mukherjee et al. [2] and the "reward" model of Hochbaum and Singh [3]. We also study a new model, which is a straightforward extension of the Boykov-Jolly model for single image segmentation [4]. In terms of optimization, we use a Dual Decomposition (DD) technique in addition to optimization methods in [1,2]. Experiments show a significant improvement of DD over published methods. Our main conclusion, however, is that the new model is the best overall because it: (i) has fewest parameters; (ii) is most robust in practice, and (iii) can be optimized well with an efficient EM-style procedure. 
AU - Vicente, Sara AU - Vladimir Kolmogorov AU - Rother, Carsten ID - 3201 TI - Cosegmentation revisited: Models and optimization VL - 6312 ER - TY - CONF AU - Vladimir Kolmogorov ID - 2934 TI - Generalized roof duality and bisubmodular functions ER - TY - JOUR AB - We describe a new implementation of the Edmonds’s algorithm for computing a perfect matching of minimum cost, to which we refer as Blossom V. A key feature of our implementation is a combination of two ideas that were shown to be effective for this problem: the “variable dual updates” approach of Cook and Rohe (INFORMS J Comput 11(2):138–148, 1999) and the use of priority queues. We achieve this by maintaining an auxiliary graph whose nodes correspond to alternating trees in the Edmonds’s algorithm. While our use of priority queues does not improve the worst-case complexity, it appears to lead to an efficient technique. In the majority of our tests Blossom V outperformed previous implementations of Cook and Rohe (INFORMS J Comput 11(2):138–148, 1999) and Mehlhorn and Schäfer (J Algorithmics Exp (JEA) 7:4, 2002), sometimes by an order of magnitude. We also show that for large VLSI instances it is beneficial to update duals by solving a linear program, contrary to a conjecture by Cook and Rohe. AU - Vladimir Kolmogorov ID - 2932 IS - 1 JF - Mathematical Programming Computation TI - Blossom V: A new implementation of a minimum cost perfect matching algorithm VL - 1 ER - TY - CONF AB - In recent years the Markov Random Field (MRF) has become the de facto probabilistic model for low-level vision applications. However, in a maximum a posteriori (MAP) framework, MRFs inherently encourage delta function marginal statistics. By contrast, many low-level vision problems have heavy tailed marginal statistics, making the MRF model unsuitable. 
In this paper we introduce a more general Marginal Probability Field (MPF), of which the MRF is a special, linear case, and show that convex energy MPFs can be used to encourage arbitrary marginal statistics. We introduce a flexible, extensible framework for effectively optimizing the resulting NP-hard MAP problem, based around dual-decomposition and a modified mincost flow algorithm, and which achieves global optimality in some instances. We use a range of applications, including image denoising and texture synthesis, to demonstrate the benefits of this class of MPF over MRFs. AU - Woodford, Oliver J AU - Rother, Carsten AU - Vladimir Kolmogorov ID - 3203 TI - A global perspective on MAP inference for low level vision ER - TY - JOUR AB - The problem of obtaining the maximum a posteriori estimate of a general discrete Markov random field (i.e., a Markov random field defined using a discrete set of labels) is known to be NP-hard. However, due to its central importance in many applications, several approximation algorithms have been proposed in the literature. In this paper, we present an analysis of three such algorithms based on convex relaxations: (i) LP-S: the linear programming (LP) relaxation proposed by Schlesinger (1976) for a special case and independently in Chekuri et al. (2001), Koster et al. (1998), and Wainwright et al. (2005) for the general case; (ii) QP-RL: the quadratic programming (QP) relaxation of Ravikumar and Lafferty (2006); and (iii) SOCP-MS: the second order cone programming (SOCP) relaxation first proposed by Muramatsu and Suzuki (2003) for two label problems and later extended by Kumar et al. (2006) for a general label set. We show that the SOCP-MS and the QP-RL relaxations are equivalent. Furthermore, we prove that despite the flexibility in the form of the constraints/objective function offered by QP and SOCP, the LP-S relaxation strictly dominates (i.e., provides a better approximation than) QP-RL and SOCP-MS. 
We generalize these results by defining a large class of SOCP (and equivalent QP) relaxations which is dominated by the LP-S relaxation. Based on these results we propose some novel SOCP relaxations which define constraints using random variables that form cycles or cliques in the graphical model representation of the random field. Using some examples we show that the new SOCP relaxations strictly dominate the previous approaches. AU - Kumar, M Pawan AU - Vladimir Kolmogorov AU - Torr, Philip H ID - 3197 JF - Journal of Machine Learning Research TI - An analysis of convex relaxations for MAP estimation of discrete MRFs VL - 10 ER - TY - CONF AB - Many interactive image segmentation approaches use an objective function which includes appearance models as an unknown variable. Since the resulting optimization problem is NP-hard the segmentation and appearance are typically optimized separately, in an EM-style fashion. One contribution of this paper is to express the objective function purely in terms of the unknown segmentation, using higher-order cliques. This formulation reveals an interesting bias of the model towards balanced segmentations. Furthermore, it enables us to develop a new dual decomposition optimization procedure, which provides additionally a lower bound. Hence, we are able to improve on existing optimizers, and verify that for a considerable number of real world examples we even achieve global optimality. This is important since we are able, for the first time, to analyze the deficiencies of the model. Another contribution is to establish a property of a particular dual decomposition approach which involves convex functions depending on foreground area. As a consequence, we show that the optimal decomposition for our problem can be computed efficiently via a parametric maxflow algorithm. 
AU - Vicente, Sara AU - Vladimir Kolmogorov AU - Rother, Carsten ID - 3199 TI - Joint optimization of segmentation and appearance models ER - TY - JOUR AB - Motivated by various applications to computer vision, we consider the convex cost tension problem, which is the dual of the convex cost flow problem. In this paper, we first propose a primal algorithm for computing an optimal solution of the problem. Our primal algorithm iteratively updates primal variables by solving associated minimum cut problems. We show that the time complexity of the primal algorithm is O(K · T(n, m)), where K is the range of primal variables and T(n, m) is the time needed to compute a minimum cut in a graph with n nodes and m edges. We then develop an improved version of the primal algorithm, called the primal-dual algorithm, by making good use of dual variables in addition to primal variables. Although its time complexity is the same as that of the primal algorithm, we can expect a better performance in practice. We finally consider an application to a computer vision problem called panoramic image stitching. AU - Vladimir Kolmogorov AU - Shioura, Akiyoshi ID - 3200 IS - 4 JF - Discrete Optimization TI - New algorithms for convex cost tension problem with application to computer vision VL - 6 ER - TY - CONF AB - We consider the problem of optimizing multilabel MRFs, which is in general NP-hard and ubiquitous in low-level computer vision. One approach for its solution is to formulate it as an integer linear program and relax the integrality constraints. The approach we consider in this paper is to first convert the multi-label MRF into an equivalent binary-label MRF and then to relax it. The resulting relaxation can be efficiently solved using a maximum flow algorithm. Its solution provides us with a partially optimal labelling of the binary variables. This partial labelling is then easily transferred to the multi-label problem. 
We study the theoretical properties of the new relaxation and compare it with the standard one. Specifically, we compare tightness, and characterize a subclass of problems where the two relaxations coincide. We propose several combined algorithms based on the technique and demonstrate their performance on challenging computer vision problems. AU - Kohli, Pushmeet AU - Shekhovtsov, Alexander AU - Rother, Carsten AU - Vladimir Kolmogorov AU - Torr, Philip H ID - 3194 TI - On partial optimality in multi label MRFs ER - TY - JOUR AB - Among the most exciting advances in early vision has been the development of efficient energy minimization algorithms for pixel-labeling tasks such as depth or texture computation. It has been known for decades that such problems can be elegantly expressed as Markov random fields, yet the resulting energy minimization problems have been widely viewed as intractable. Algorithms such as graph cuts and loopy belief propagation (LBP) have proven to be very powerful: For example, such methods form the basis for almost all the top-performing stereo methods. However, the trade-offs among different energy minimization algorithms are still not well understood. In this paper, we describe a set of energy minimization benchmarks and use them to compare the solution quality and runtime of several common energy minimization algorithms. We investigate three promising methods - graph cuts, LBP, and tree-reweighted message passing - in addition to the well-known older iterated conditional mode (ICM) algorithm. Our benchmark problems are drawn from published energy functions used for stereo, image stitching, interactive segmentation, and denoising. We also provide a general-purpose software interface that allows vision researchers to easily switch between optimization methods. The benchmarks, code, images, and results are available at http://vision.middlebury.edu/MRF/. 
AU - Szeliski, Richard S AU - Zabih, Ramin AU - Scharstein, Daniel AU - Veksler, Olga AU - Vladimir Kolmogorov AU - Agarwala, Aseem AU - Tappen, Marshall F AU - Rother, Carsten ID - 3196 IS - 6 JF - IEEE Transactions on Pattern Analysis and Machine Intelligence TI - A comparative study of energy minimization methods for Markov random fields with smoothness-based priors VL - 30 ER - TY - CONF AB - In this paper we present a new approach for establishing correspondences between sparse image features related by an unknown non-rigid mapping and corrupted by clutter and occlusion, such as points extracted from a pair of images containing a human figure in distinct poses. We formulate this matching task as an energy minimization problem by defining a complex objective function of the appearance and the spatial arrangement of the features. Optimization of this energy is an instance of graph matching, which is in general an NP-hard problem. We describe a novel graph matching optimization technique, which we refer to as dual decomposition (DD), and demonstrate on a variety of examples that this method outperforms existing graph matching algorithms. In the majority of our examples DD is able to find the global minimum within a minute. The ability to globally optimize the objective allows us to accurately learn the parameters of our matching model from training examples. We show on several matching tasks that our learned model yields results superior to those of state-of-the-art methods. AU - Torresani, Lorenzo AU - Vladimir Kolmogorov AU - Rother, Carsten ID - 3198 TI - Feature correspondence via graph matching: Models and global optimization VL - 5303 ER - TY - CONF AB - Graph cut is a popular technique for interactive image segmentation. However, it has certain shortcomings. In particular, graph cut has problems with segmenting thin elongated objects due to the "shrinking bias". 
To overcome this problem, we propose to impose an additional connectivity prior, which is a very natural assumption about objects. We formulate several versions of the connectivity constraint and show that the corresponding optimization problems are all NP-hard. For some of these versions we propose two optimization algorithms: (i) a practical heuristic technique which we call DijkstraGC, and (ii) a slow method based on problem decomposition which provides a lower bound on the problem. We use the second technique to verify that for some practical examples DijkstraGC is able to find the global minimum. AU - Vicente, Sara AU - Vladimir Kolmogorov AU - Rother, Carsten ID - 3195 TI - Graph cut based image segmentation with connectivity priors ER - TY - CONF AU - Kumar, M Pawan AU - Vladimir Kolmogorov AU - Torr, Philip H ID - 2933 TI - An Analysis of Convex Relaxations for MAP Estimation ER - TY - CONF AB - Many computer vision applications rely on the efficient optimization of challenging, so-called non-submodular, binary pairwise MRFs. A promising graph cut based approach for optimizing such MRFs known as "roof duality" was recently introduced into computer vision. We study two methods which extend this approach. First, we discuss an efficient implementation of the "probing" technique introduced recently by Boros et al. [5]. It simplifies the MRF while preserving the global optimum. Our code is 400-700 times faster on some graphs than the implementation of [5]. Second, we present a new technique which takes an arbitrary input labeling and tries to improve its energy. We give theoretical characterizations of local minima of this procedure. We applied both techniques to many applications, including image segmentation, new view synthesis, superresolution, diagram recognition, parameter learning, texture restoration, and image deconvolution. 
For several applications we see that we are able to find the global minimum very efficiently, and considerably outperform the original roof duality approach. In comparison to existing techniques, such as graph cut, TRW, BP, ICM, and simulated annealing, we nearly always find a lower energy. AU - Rother, Carsten AU - Vladimir Kolmogorov AU - Lempitsky, Victor AU - Szummer, Martin ID - 3192 TI - Optimizing binary MRFs via extended roof duality ER - TY - CONF AB - The maximum flow algorithm for minimizing energy functions of binary variables has become a standard tool in computer vision. In many cases, unary costs of the energy depend linearly on parameter lambda. In this paper we study vision applications for which it is important to solve the maxflow problem for different lambda's. An example is a weighting between data and regularization terms in image segmentation or stereo: it is desirable to vary it both during training (to learn lambda from ground truth data) and testing (to select best lambda using high-knowledge constraints, e.g. user input). We review algorithmic aspects of this parametric maximum flow problem previously unknown in vision, such as the ability to compute all breakpoints of lambda and corresponding optimal configurations in finite time. These results allow, in particular, to minimize the ratio of some geometric functional, such as flux of a vector field over length (or area). Previously, such functionals were tackled with shortest path techniques applicable only in 2D. We give theoretical improvements for "PDE cuts" [5]. We present experimental results for image segmentation, 3D reconstruction, and the cosegmentation problem. AU - Vladimir Kolmogorov AU - Boykov, Yuri AU - Rother, Carsten ID - 3191 TI - Applications of parametric maxflow in computer vision ER - TY - JOUR AB - Optimization techniques based on graph cuts have become a standard tool for many vision applications. 
These techniques allow efficient minimization of certain energy functions corresponding to pairwise Markov Random Fields (MRFs). Currently, there is an accepted view within the computer vision community that graph cuts can only be used for optimizing a limited class of MRF energies (e.g., submodular functions). In this survey, we review some results that show that graph cuts can be applied to a much larger class of energy functions (in particular, nonsubmodular functions). While these results are well-known in the optimization community, to our knowledge they were not used in the context of computer vision and MRF optimization. We demonstrate the relevance of these results to vision on the problem of binary texture restoration. AU - Vladimir Kolmogorov AU - Rother, Carsten ID - 3193 IS - 7 JF - IEEE Transactions on Pattern Analysis and Machine Intelligence TI - Minimizing nonsubmodular functions with graph cuts - A review VL - 29 ER - TY - JOUR AB - Stereo vision has numerous applications in robotics, graphics, inspection and other areas. A prime application, one which has driven work on stereo in our laboratory, is teleconferencing in which the use of a stereo webcam already makes possible various transformations of the video stream. These include digital camera control, insertion of virtual objects, background substitution, and eye-gaze correction [9, 8]. AU - Blake, Andrew AU - Criminisi, Antonio AU - Cross, Geoffrey AU - Vladimir Kolmogorov AU - Rother, Carsten ID - 3187 JF - Springer Tracts in Advanced Robotics TI - Fusion of stereo colour and contrast VL - 28 ER - TY - CHAP AB - Most binocular stereo algorithms assume that all scene elements are visible from both cameras. Scene elements that are visible from only one camera, known as occlusions, pose an important challenge for stereo. Occlusions are important for segmentation, because they appear near discontinuities. However, stereo algorithms tend to ignore occlusions because of their difficulty. 
One reason is that occlusions require the input images to be treated symmetrically, which complicates the problem formulation. Worse, certain depth maps imply physically impossible scene configurations, and must be excluded from the output. In this chapter we approach the problem of binocular stereo with occlusions from an energy minimization viewpoint. We begin by reviewing traditional stereo methods that do not handle occlusions. If occlusions are ignored, it is easy to formulate the stereo problem as a pixel labeling problem, which leads to an energy function that is common in early vision. This kind of energy function can be minimized using graph cuts, which is a combinatorial optimization technique that has proven to be very effective for low-level vision problems. Motivated by this, we have designed two graph cut stereo algorithms that are designed to handle occlusions. These algorithms produce promising experimental results on real data with ground truth. AU - Vladimir Kolmogorov AU - Zabih, Ramin ID - 2921 T2 - Handbook of Mathematical Models in Computer Vision TI - Graph cut algorithms for binocular stereo with occlusions ER - TY - CONF AB - This paper presents an algorithm capable of real-time separation of foreground from background in monocular video sequences. Automatic segmentation of layers from colour/contrast or from motion alone is known to be error-prone. Here motion, colour and contrast cues are probabilistically fused together with spatial and temporal priors to infer layers accurately and efficiently. Central to our algorithm is the fact that pixel velocities are not needed, thus removing the need for optical flow estimation, with its tendency to error and computational expense. Instead, an efficient motion vs non-motion classifier is trained to operate directly and jointly on intensity-change and contrast. Its output is then fused with colour information. 
The prior on segmentation is represented by a second order, temporal, Hidden Markov Model, together with a spatial MRF favouring coherence except where contrast is high. Finally, accurate layer segmentation and explicit occlusion detection are efficiently achieved by binary graph cut. The segmentation accuracy of the proposed algorithm is quantitatively evaluated with respect to existing ground-truth data and found to be comparable to the accuracy of a state of the art stereo segmentation algorithm. Foreground/background segmentation is demonstrated in the application of live background substitution and shown to generate convincingly good quality composite video. AU - Criminisi, Antonio AU - Cross, Geoffrey AU - Blake, Andrew AU - Vladimir Kolmogorov ID - 3189 TI - Bilayer segmentation of live video VL - 1 ER - TY - JOUR AB - Algorithms for discrete energy minimization are of fundamental importance in computer vision. In this paper, we focus on the recent technique proposed by Wainwright et al. (Nov. 2005) - tree-reweighted max-product message passing (TRW). It was inspired by the problem of maximizing a lower bound on the energy. However, the algorithm is not guaranteed to increase this bound - it may actually go down. In addition, TRW does not always converge. We develop a modification of this algorithm which we call sequential tree-reweighted message passing. Its main property is that the bound is guaranteed not to decrease. We also give a weak tree agreement condition which characterizes local maxima of the bound with respect to TRW algorithms. We prove that our algorithm has a limit point that achieves weak tree agreement. Finally, we show that our algorithm requires half as much memory as traditional message passing approaches. Experimental results demonstrate that on certain synthetic and real problems, our algorithm outperforms both the ordinary belief propagation and tree-reweighted algorithm in (M. J. Wainwright, et al., Nov. 2005). 
In addition, on stereo problems with Potts interactions, we obtain a lower energy than graph cuts. AU - Vladimir Kolmogorov ID - 3190 IS - 10 JF - IEEE Transactions on Pattern Analysis and Machine Intelligence TI - Convergent tree reweighted message passing for energy minimization VL - 28 ER - TY - CONF AB - We introduce the term cosegmentation which denotes the task of segmenting simultaneously the common parts of an image pair. A generative model for cosegmentation is presented. Inference in the model leads to minimizing an energy with an MRF term encoding spatial coherency and a global constraint which attempts to match the appearance histograms of the common parts. This energy has not been proposed previously and its optimization is challenging and NP-hard. For this problem a novel optimization scheme which we call trust region graph cuts is presented. We demonstrate that this framework has the potential to improve a wide range of research: Object driven image retrieval, video tracking and segmentation, and interactive image editing. The power of the framework lies in its generality, the common part can be a rigid/non-rigid object (or scene), observed from different viewpoints or even similar objects of the same class. AU - Rother, Carsten AU - Vladimir Kolmogorov AU - Minka, Thomas P AU - Blake, Andrew ID - 3188 TI - Cosegmentation of image pairs by histogram matching - Incorporating a global constraint into MRFs ER - TY - CONF AB - One of the most exciting advances in early vision has been the development of efficient energy minimization algorithms. Many early vision tasks require labeling each pixel with some quantity such as depth or texture. While many such problems can be elegantly expressed in the language of Markov Random Fields (MRF's), the resulting energy minimization problems were widely viewed as intractable. 
Recently, algorithms such as graph cuts and loopy belief propagation (LBP) have proven to be very powerful: for example, such methods form the basis for almost all the top-performing stereo methods. Unfortunately, most papers define their own energy function, which is minimized with a specific algorithm of their choice. As a result, the tradeoffs among different energy minimization algorithms are not well understood. In this paper we describe a set of energy minimization benchmarks, which we use to compare the solution quality and running time of several common energy minimization algorithms. We investigate three promising recent methods - graph cuts, LBP, and tree-reweighted message passing - as well as the well-known older iterated conditional modes (ICM) algorithm. Our benchmark problems are drawn from published energy functions used for stereo, image stitching and interactive segmentation. We also provide a general-purpose software interface that allows vision researchers to easily switch between optimization methods with minimal overhead. We expect that the availability of our benchmarks and interface will make it significantly easier for vision researchers to adopt the best method for their specific problems. Benchmarks, code, results and images are available at http://vision.middlebury.edu/MRF. AU - Szeliski, Richard S AU - Zabih, Ramin AU - Scharstein, Daniel AU - Veksler, Olga AU - Vladimir Kolmogorov AU - Agarwala, Aseem AU - Tappen, Marshall F AU - Rother, Carsten ID - 3180 TI - A comparative study of energy minimization methods for Markov random fields VL - 3952 ER - TY - CONF AB - Algorithms for discrete energy minimization play a fundamental role for low-level vision. Known techniques include graph cuts, belief propagation (BP) and recently introduced tree-reweighted message passing (TRW). So far, the standard benchmark for their comparison has been a 4-connected grid-graph arising in pixel-labelling stereo. 
This minimization problem, however, has been largely solved: recent work shows that for many scenes TRW finds the global optimum. Furthermore, it is known that a 4-connected grid-graph is a poor stereo model since it does not take occlusions into account. We propose the problem of stereo with occlusions as a new test bed for minimization algorithms. This is a more challenging graph since it has much larger connectivity, and it also serves as a better stereo model. An attractive feature of this problem is that increased connectivity does not result in increased complexity of message passing algorithms. Indeed, one contribution of this paper is to show that sophisticated implementations of BP and TRW have the same time and memory complexity as that of 4-connected grid-graph stereo. The main conclusion of our experimental study is that for our problem graph cut outperforms both TRW and BP considerably. TRW achieves consistently a lower energy than BP. However, as connectivity increases the speed of convergence of TRW becomes slower. Unlike 4-connected grids, the difference between the energy of the best optimization method and the lower bound of TRW appears significant. This shows the hardness of the problem and motivates future research. AU - Vladimir Kolmogorov AU - Rother, Carsten ID - 3184 TI - Comparison of energy minimization algorithms for highly connected graphs VL - 3952 LNCS ER - TY - JOUR AB - This paper describes models and algorithms for the real-time segmentation of foreground from background layers in stereo video sequences. Automatic separation of layers from color/contrast or from stereo alone is known to be error-prone. Here, color, contrast, and stereo matching information are fused to infer layers accurately and efficiently. The first algorithm, Layered Dynamic Programming (LDP), solves stereo in an extended six-state space that represents both foreground/background layers and occluded regions. 
The stereo-match likelihood is then fused with a contrast-sensitive color model that is learned on-the-fly and stereo disparities are obtained by dynamic programming. The second algorithm, Layered Graph Cut (LGC), does not directly solve stereo. Instead, the stereo match likelihood is marginalized over disparities to evaluate foreground and background hypotheses and then fused with a contrast-sensitive color model like the one used in LDP. Segmentation is solved efficiently by ternary graph cut. Both algorithms are evaluated with respect to ground truth data and found to have similar performance, substantially better than either stereo or color/contrast alone. However, their characteristics with respect to computational efficiency are rather different. The algorithms are demonstrated in the application of background substitution and shown to give good quality composite video output. AU - Vladimir Kolmogorov AU - Criminisi, Antonio AU - Blake, Andrew AU - Cross, Geoffrey AU - Rother, Carsten ID - 3185 IS - 9 JF - IEEE Transactions on Pattern Analysis and Machine Intelligence TI - Probabilistic fusion of stereo with color and contrast for bilayer segmentation VL - 28 ER - TY - CONF AB - We introduce a new approach to modelling gradient flows of contours and surfaces. While standard variational methods (e.g. level sets) compute local interface motion in a differential fashion by estimating local contour velocity via energy derivatives, we propose to solve surface evolution PDEs by explicitly estimating integral motion of the whole surface. We formulate an optimization problem directly based on an integral characterization of gradient flow as an infinitesimal move of the (whole) surface giving the largest energy decrease among all moves of equal size. We show that this problem can be efficiently solved using recent advances in algorithms for global hypersurface optimization [4, 2, 11]. 
In particular, we employ the geo-cuts method [4] that uses ideas from integral geometry to represent continuous surfaces as cuts on discrete graphs. The resulting interface evolution algorithm is validated on some 2D and 3D examples similar to typical demonstrations of level-set methods. Our method can compute gradient flows of hypersurfaces with respect to a fairly general class of continuous functionals and it is flexible with respect to distance metrics on the space of contours/surfaces. Preliminary tests for standard L2 distance metric demonstrate numerical stability, topological changes and an absence of any oscillatory motion. AU - Boykov, Yuri AU - Vladimir Kolmogorov AU - Cremers, Daniel AU - Delong, Andrew ID - 3186 TI - An integral solution to surface evolution PDEs via geo cuts VL - 3953 ER - TY - CONF AB - This paper addresses the novel problem of automatically synthesizing an output image from a large collection of different input images. The synthesized image, called a digital tapestry, can be viewed as a visual summary or a virtual 'thumbnail' of all the images in the input collection. The problem of creating the tapestry is cast as a multi-class labeling problem such that each region in the tapestry is constructed from input image blocks that are salient and such that neighboring blocks satisfy spatial compatibility. This is formulated using a Markov Random Field and optimized via the graph cut based expansion move algorithm. The standard expansion move algorithm can only handle energies with metric terms, while our energy contains non-metric (soft and hard) constraints. Therefore we propose two novel contributions. First, we extend the expansion move algorithm for energy functions with non-metric hard constraints. Secondly, we modify it for functions with "almost" metric soft terms, and show that it gives good results in practice. The proposed framework was tested on several consumer photograph collections, and the results are presented. 
AU - Rother, Carsten AU - Kumar, Sanjiv AU - Vladimir Kolmogorov AU - Blake, Andrew ID - 3175 TI - Digital tapestry VL - 1 ER - TY - CONF AB - This paper demonstrates high-quality, real-time segmentation techniques. We achieve real-time segmentation of foreground from background layers in stereo video sequences. Automatic separation of layers from colour/contrast or from stereo alone is known to be error-prone. Here, colour, contrast and stereo matching information are fused to infer layers accurately and efficiently. The first algorithm, layered dynamic programming (LDP), solves stereo in an extended 6-state space that represents both foreground/background layers and occluded regions. The stereo-match likelihood is then fused with a contrast-sensitive colour model that is learned on the fly, and stereo disparities are obtained by dynamic programming. The second algorithm, layered graph cut (LGC), does not directly solve stereo. Instead the stereo match likelihood is marginalised over foreground and background hypotheses, and fused with a contrast-sensitive colour model like the one used in LDP. Segmentation is solved efficiently by ternary graph cut. Both algorithms are evaluated with respect to ground truth data and found to have similar performance, substantially better than stereo or colour/contrast alone. However, their characteristics with respect to computational efficiency are rather different. The algorithms are demonstrated in the application of background substitution and shown to give good quality composite video output. AU - Vladimir Kolmogorov AU - Criminisi, Antonio AU - Blake, Andrew AU - Cross, Geoffrey AU - Rother, Carsten ID - 3176 TI - Bi-layer segmentation of binocular stereo video ER - TY - CONF AB - This paper describes two algorithms capable of real-time segmentation of foreground from background layers in stereo video sequences. Automatic separation of layers from colour/contrast or from stereo alone is known to be error-prone. 
Here, colour, contrast and stereo matching information are fused to infer layers accurately and efficiently. The first algorithm, Layered Dynamic Programming (LDP), solves stereo in an extended 6-state space that represents both foreground/background layers and occluded regions. The stereo-match likelihood is then fused with a contrast-sensitive colour model that is learned on the fly, and stereo disparities are obtained by dynamic programming. The second algorithm, Layered Graph Cut (LGC), does not directly solve stereo. Instead the stereo match likelihood is marginalised over foreground and background hypotheses, and fused with a contrast-sensitive colour model like the one used in LDP. Segmentation is solved efficiently by ternary graph cut. Both algorithms are evaluated with respect to ground truth data and found to have similar performance, substantially better than stereo or colour/contrast alone. However, their characteristics with respect to computational efficiency are rather different. The algorithms are demonstrated in the application of background substitution and shown to give good quality composite video output. AU - Vladimir Kolmogorov AU - Criminisi, Antonio AU - Blake, Andrew AU - Cross, Geoffrey AU - Rother, Carsten ID - 3183 TI - Bi-layer segmentation of binocular stereo video VL - 2 ER - TY - CONF AB - In the work of the authors (2003), we showed that graph cuts can find hypersurfaces of globally minimal length (or area) under any Riemannian metric. Here we show that graph cuts on directed regular grids can approximate a significantly more general class of continuous non-symmetric metrics. Using the submodularity condition (Boros and Hammer, 2002 and Kolmogorov and Zabih, 2004), we obtain a tight characterization of graph-representable metrics. Such "submodular" metrics have an elegant geometric interpretation via hypersurface functionals combining length/area and flux. 
Practically speaking, we extend the 'geo-cuts' algorithm to a wider class of geometrically motivated hypersurface functionals and show how to globally optimize any combination of length/area and flux of a given vector field. The concept of flux was recently introduced into computer vision by Vasilevskiy and Siddiqi (2002) but it was mainly studied within the variational framework so far. We are the first to show that flux can be integrated into graph cuts as well. Combining geometric concepts of flux and length/area within the global optimization framework of graph cuts allows principled discrete segmentation models and advances the state of the art for graph cuts methods in vision. In particular we address the "shrinking" problem of graph cuts, improve segmentation of long thin objects, and introduce useful shape constraints. AU - Vladimir Kolmogorov AU - Boykov, Yuri ID - 3182 TI - What metrics can be approximated by geo cuts or global optimization of length area and flux VL - 1 ER - TY - CONF AB - Tree-reweighted max-product (TRW) message passing [9] is a modified form of the ordinary max-product algorithm for attempting to find minimal energy configurations in Markov random fields with cycles. For a TRW fixed point satisfying the strong tree agreement condition, the algorithm outputs a configuration that is provably optimal. In this paper, we focus on the case of binary variables with pairwise couplings, and establish stronger properties of TRW fixed points that satisfy only the milder condition of weak tree agreement (WTA). First, we demonstrate how it is possible to identify part of the optimal solution - i.e., a provably optimal solution for a subset of nodes - without knowing a complete solution. Second, we show that for submodular functions, a WTA fixed point always yields a globally optimal solution. We establish that for binary variables, any WTA fixed point always achieves the global maximum of the linear programming relaxation underlying the TRW method. 
AU - Vladimir Kolmogorov AU - Wainwright, Martin J ID - 3181 TI - On the optimality of tree reweighted max product message passing ER - TY - JOUR AB - Minimum cut/maximum flow algorithms on graphs have emerged as an increasingly useful tool for exact or approximate energy minimization in low-level vision. The combinatorial optimization literature provides many min-cut/max-flow algorithms with different polynomial time complexity. Their practical efficiency, however, has to date been studied mainly outside the scope of computer vision. The goal of this paper is to provide an experimental comparison of the efficiency of min-cut/max-flow algorithms for applications in vision. We compare the running times of several standard algorithms, as well as a new algorithm that we have recently developed. The algorithms we study include both Goldberg-Tarjan style "push-relabel" methods and algorithms based on Ford-Fulkerson style "augmenting paths." We benchmark these algorithms on a number of typical graphs in the contexts of image restoration, stereo, and segmentation. In many cases, our new algorithm works several times faster than any of the other methods, making near real-time performance possible. An implementation of our max-flow/min-cut algorithm is available upon request for research purposes. AU - Boykov, Yuri AU - Vladimir Kolmogorov ID - 3178 IS - 9 JF - IEEE Transactions on Pattern Analysis and Machine Intelligence TI - An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision VL - 26 ER - TY - JOUR AB - In the last few years, several new algorithms based on graph cuts have been developed to solve energy minimization problems in computer vision. Each of these techniques constructs a graph such that the minimum cut on the graph also minimizes the energy. Yet, because these graph constructions are complex and highly specific to a particular energy function, graph cuts have seen limited application to date. 
In this paper, we give a characterization of the energy functions that can be minimized by graph cuts. Our results are restricted to functions of binary variables. However, our work generalizes many previous constructions and is easily applicable to vision problems that involve large numbers of labels, such as stereo, motion, image restoration, and scene reconstruction. We give a precise characterization of what energy functions can be minimized using graph cuts, among the energy functions that can be written as a sum of terms containing three or fewer binary variables. We also provide a general-purpose construction to minimize such an energy function. Finally, we give a necessary condition for any energy function of binary variables to be minimized by graph cuts. Researchers who are considering the use of graph cuts to optimize a particular energy function can use our results to determine if this is possible and then follow our construction to create the appropriate graph. A software implementation is freely available. AU - Vladimir Kolmogorov AU - Zabih, Ramin ID - 3173 IS - 2 JF - IEEE Transactions on Pattern Analysis and Machine Intelligence TI - What energy functions can be minimized via graph cuts? VL - 26 ER - TY - JOUR AB - The simultaneous multiple volume (SMV) approach in navigator-gated MRI allows the use of the whole motion range or the entire scan time for the reconstruction of final images by simultaneously acquiring different image volumes at different motion states. The motion tolerance range for each volume is kept small, thus SMV substantially increases the scan efficiency of navigator methods while maintaining the effectiveness of motion suppression. This article reports a general implementation of the SMV approach using a multiprocessor scheduling algorithm. Each motion state is regarded as a processor and each volume is regarded as a job. 
An efficient scheduling that completes all jobs in minimal time is maintained even when the motion pattern changes. Initial experiments demonstrated that SMV significantly increased the scan efficiency of navigator-gated MRI. AU - Vladimir Kolmogorov AU - Nguyen, Thành D AU - Nuval, Anthony AU - Spincemaille, Pascal AU - Prince, Martin R AU - Zabih, Ramin AU - Wang, Yusu ID - 3172 IS - 2 JF - Magnetic Resonance in Medicine TI - Multiprocessor scheduling implementation of the simultaneous multiple volume SMV navigator method VL - 52 ER - TY - CONF AB - Feature space clustering is a popular approach to image segmentation, in which a feature vector of local properties (such as intensity, texture or motion) is computed at each pixel. The feature space is then clustered, and each pixel is labeled with the cluster that contains its feature vector. A major limitation of this approach is that feature space clusters generally lack spatial coherence (i.e., they do not correspond to a compact grouping of pixels). In this paper, we propose a segmentation algorithm that operates simultaneously in feature space and in image space. We define an energy function over both a set of clusters and a labeling of pixels with clusters. In our framework, a pixel is labeled with a single cluster (rather than, for example, a distribution over clusters). Our energy function penalizes clusters that are a poor fit to the data in feature space, and also penalizes clusters whose pixels lack spatial coherence. The energy function can be efficiently minimized using graph cuts. Our algorithm can incorporate both parametric and non-parametric clustering methods. It can be applied to many optimization-based clustering methods, including k-means and k-medians, and can handle models which are very close in feature space. Preliminary results are presented on segmenting real and synthetic images, using both parametric and non-parametric clustering. 
AU - Zabih, Ramin AU - Vladimir Kolmogorov ID - 3177 TI - Spatially coherent clustering using graph cuts VL - 2 ER - TY - CONF AB - The problem of efficient, interactive foreground/background segmentation in still images is of great practical importance in image editing. Classical image segmentation tools use either texture (colour) information, e.g. Magic Wand, or edge (contrast) information, e.g. Intelligent Scissors. Recently, an approach based on optimization by graph-cut has been developed which successfully combines both types of information. In this paper we extend the graph-cut approach in three respects. First, we have developed a more powerful, iterative version of the optimisation. Secondly, the power of the iterative algorithm is used to simplify substantially the user interaction needed for a given quality of result. Thirdly, a robust algorithm for "border matting" has been developed to estimate simultaneously the alpha-matte around an object boundary and the colours of foreground pixels. We show that for moderately difficult examples the proposed method outperforms competitive tools. AU - Rother, Carsten AU - Vladimir Kolmogorov AU - Blake, Andrew ID - 3179 IS - 3 TI - "GrabCut" - Interactive foreground extraction using iterated graph cuts VL - 23 ER - TY - CONF AB - Reconstructing a 3-D scene from more than one camera is a classical problem in computer vision. One of the major sources of difficulty is the fact that not all scene elements are visible from all cameras. In the last few years, two promising approaches have been developed [11, 12] that formulate the scene reconstruction problem in terms of energy minimization, and minimize the energy using graph cuts. These energy minimization approaches treat the input images symmetrically, handle visibility constraints correctly, and allow spatial smoothness to be enforced. However, these algorithms propose different problem formulations, and handle a limited class of smoothness terms. 
One algorithm [11] uses a problem formulation that is restricted to two-camera stereo, and imposes smoothness between a pair of cameras. The other algorithm [12] can handle an arbitrary number of cameras, but imposes smoothness only with respect to a single camera. In this paper we give a more general energy minimization formulation for the problem, which allows a larger class of spatial smoothness constraints. We show that our formulation includes both of the previous approaches as special cases, as well as permitting new energy functions. Experimental results on real data with ground truth are also included. AU - Vladimir Kolmogorov AU - Zabih, Ramin AU - Gortler, Steven ID - 3171 TI - Generalized multi camera scene reconstruction using graph cuts VL - 2683 ER - TY - CONF AB - We address visual correspondence problems without assuming that scene points have similar intensities in different views. This situation is common, usually due to non-Lambertian scenes or to differences between cameras. We use maximization of mutual information, a powerful technique for registering images that requires no a priori model of the relationship between scene intensities in different views. However, it has proven difficult to use mutual information to compute dense visual correspondence. Comparing fixed-size windows via mutual information suffers from the well-known problems of fixed windows, namely poor performance at discontinuities and in low-texture regions. In this paper, we show how to compute visual correspondence using mutual information without suffering from these problems. Using a simple approximation, mutual information can be incorporated into the standard energy minimization framework used in early vision. The energy can then be efficiently minimized using graph cuts, which preserve discontinuities and handle low-texture regions. 
The resulting algorithm combines the accurate disparity maps that come from graph cuts with the tolerance for intensity changes that comes from mutual information. AU - Kim, Junhwan AU - Vladimir Kolmogorov AU - Zabih, Ramin ID - 3174 TI - Visual correspondence using energy minimization and mutual information VL - 2 ER - TY - CONF AB - Geodesic active contours and graph cuts are two standard image segmentation techniques. We introduce a new segmentation method combining some of their benefits. Our main intuition is that any cut on a graph embedded in some continuous space can be interpreted as a contour (in 2D) or a surface (in 3D). We show how to build a grid graph and set its edge weights so that the cost of cuts is arbitrarily close to the length (area) of the corresponding contours (surfaces) for any anisotropic Riemannian metric. There are two interesting consequences of this technical result. First, graph cut algorithms can be used to find globally minimum geodesic contours (minimal surfaces in 3D) under an arbitrary Riemannian metric for a given set of boundary conditions. Second, we show how to minimize metrication artifacts in existing graph-cut based methods in vision. Theoretically speaking, our work provides an interesting link between several branches of mathematics: differential geometry, integral geometry, and combinatorial optimization. The main technical problem is solved using the Cauchy-Crofton formula from integral geometry. AU - Boykov, Yuri AU - Vladimir Kolmogorov ID - 3170 TI - Computing geodesics and minimal surfaces via graph cuts VL - 1 ER - TY - CONF AB - In the last few years, several new algorithms based on graph cuts have been developed to solve energy minimization problems in computer vision. Each of these techniques constructs a graph such that the minimum cut on the graph also minimizes the energy. 
Yet because these graph constructions are complex and highly specific to a particular energy function, graph cuts have seen limited application to date. In this paper we characterize the energy functions that can be minimized by graph cuts. Our results are restricted to energy functions with binary variables. However, our work generalizes many previous constructions, and is easily applicable to vision problems that involve large numbers of labels, such as stereo, motion, image restoration and scene reconstruction. We present three main results: a necessary condition for any energy function that can be minimized by graph cuts; a sufficient condition for energy functions that can be written as a sum of functions of up to three variables at a time; and a general-purpose construction to minimize such an energy function. Researchers who are considering the use of graph cuts to optimize a particular energy function can use our results to determine if this is possible, and then follow our construction to create the appropriate graph. AU - Kolmogorov, Vladimir AU - Zabih, Ramin ID - 2927 SN - 9783540437468 T2 - Proceedings of the 7th European Conference on Computer Vision TI - Multi-camera scene reconstruction via graph cuts ER - TY - CONF AB - Several new algorithms for visual correspondence based on graph cuts [7, 14, 17] have recently been developed. While these methods give very strong results in practice, they do not handle occlusions properly. Specifically, they treat the two input images asymmetrically, and they do not ensure that a pixel corresponds to at most one pixel in the other image. In this paper, we present a new method which properly addresses occlusions, while preserving the advantages of graph cut algorithms. We give experimental results for stereo as well as motion, which demonstrate that our method performs well both at detecting occlusions and computing disparities. 
AU - Kolmogorov, Vladimir AU - Zabih, Ramin ID - 3169 SN - 0769511430 T2 - Proceedings of the 8th IEEE International Conference on Computer Vision TI - Computing visual correspondence with occlusions using graph cuts VL - 2 ER -