Maximizing Submodular Functions (Part 1)
In this post and the next, I will talk about the problem of maximizing a submodular function. Submodularity is a natural property of set functions that captures the diminishing returns property. Formally, let $f : 2^U \rightarrow \mathbb{R}$ be a set function on a finite ground set $U$. Then $f$ is submodular if for any sets $S \subseteq T \subseteq U$, and any $x \in U \setminus T$, we have the inequality

$f(S \cup \{x\}) - f(S) \geq f(T \cup \{x\}) - f(T).$
In other words, the marginal value of an element $x$ to a set $S$ is a non-increasing function of the set $S$. Let's start with some examples of submodular functions:
- $f(S) = g(|S|)$ for any concave function $g$.
- Coverage function: we have a set $A_i$ associated with each $i \in U$ (so $U$ corresponds to the sets in a set cover instance). $f(S)$ is the number of elements covered by the sets in $S$, i.e. $f(S) = |\bigcup_{i \in S} A_i|$. More generally, if we have a non-negative weight $w_e$ associated with each element $e$, then we can set $f(S)$ to be the total weight of the elements covered by the sets in $S$.
- Cut function: $U$ is the vertex set of a graph $G = (U, E)$. $f(S)$ is the number of edges in the cut $(S, U \setminus S)$.
- Matroid rank function: we have a matroid $M$ on the ground set $U$, and $f(S)$ is the size of the largest independent set contained in $S$.
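To make these definitions concrete, here is a minimal Python sketch; the small graph and the names `cut` and `is_submodular` are my own illustrations, not from the post. It implements the cut function of example 3 and exhaustively checks the diminishing-returns inequality:

```python
import itertools

# A small made-up graph on vertices {0,...,4} (illustrative only).
U = range(5)
EDGES = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 4), (3, 4)]

def cut(S):
    """Cut function: number of edges with exactly one endpoint in S."""
    S = set(S)
    return sum(1 for (u, v) in EDGES if (u in S) != (v in S))

def subsets(X):
    """All subsets of X, as sets."""
    X = sorted(X)
    return (set(c) for r in range(len(X) + 1)
            for c in itertools.combinations(X, r))

def is_submodular(f, ground):
    """Check f(S + x) - f(S) >= f(T + x) - f(T) for all S ⊆ T and x ∉ T."""
    for T in subsets(ground):
        for S in subsets(T):
            for x in set(ground) - T:
                if f(S | {x}) - f(S) < f(T | {x}) - f(T):
                    return False
    return True

print(is_submodular(cut, U))  # cut functions are submodular, so this prints True
```

The check is brute force over all pairs $S \subseteq T$, which is fine for five vertices and makes the definition easy to experiment with.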
Given a non-negative submodular function $f$, it is natural to ask if we can find the set $S$ that maximizes $f(S)$. How is $f$ given to us? We assume we have oracle access to the value of the function: given a set $S$, we can compute $f(S)$ efficiently. Thus, for example, we may have a succinct structure (such as the graph in example 3 above) that allows us to compute $f$ quickly. As example 3 captures the MAXCUT problem, the problem at hand is APX-hard.
Note that in this abstract form, we really have very little to work with. The algorithm can compute $f$ on a polynomial number of sets, possibly chosen adaptively. The analysis can use only the non-negativity of $f$ and the decreasing-marginals inequality above. On the positive side, this means both the algorithms and the analyses are usually simple and short.
Uri Feige, Vahab Mirrokni and Jan Vondrák studied this general problem in 2007 and showed that one cannot hope to get better than a 1/2 approximation: a better-than-half approximation would require exponentially many queries to the oracle (in fact, Dobzinski and Vondrák recently showed that this is not just an informational bottleneck: even for succinctly represented functions, it is NP-hard to get a better-than-half approximation). On the positive side, Feige, Mirrokni and Vondrák (FMV) showed that a random set gives a 1/4 approximation, and gave a beautiful 0.4 approximation algorithm based on smoothed local search. Subsequent works improved the approximation ratio to 0.41 and 0.42.
Today I will sketch the 1/4 approximation. FMV first prove a simple inequality. Let $R_p(T)$ denote a random set where each element in $T$ is chosen independently with probability $p$. Then they show that for any submodular function $f$, any set $T$, and any $p \in [0,1]$,

$\mathbb{E}[f(R_p(T))] \geq (1-p)\,f(\emptyset) + p\,f(T).$
Intuitively, this is a concavity-type statement. The proof is in fact a very simple induction on $|T|$: it is true for $T$ being empty, and it is an easy exercise to do the induction step using submodularity. Using this inequality, we get that for any disjoint sets $S$ and $T$, and any $p, q \in [0,1]$,

$\mathbb{E}[f(R_p(S) \cup R_q(T))] \geq (1-p)\,\mathbb{E}[f(R_q(T))] + p\,\mathbb{E}[f(S \cup R_q(T))].$
Here we have used the fact that, conditioned on any value of $R_q(T)$, the function $A \mapsto f(A \cup R_q(T))$ is submodular, so we can apply the above inequality and then average over $R_q(T)$. Similarly, if we now use the fact that $A \mapsto f(A)$ and $A \mapsto f(S \cup A)$ are submodular and apply the inequality again to each term, we get that the above right hand side is lower bounded by

$(1-p)(1-q)\,f(\emptyset) + (1-p)q\,f(T) + p(1-q)\,f(S) + pq\,f(S \cup T).$
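As a numerical sanity check of this four-term bound, the following Python sketch computes $\mathbb{E}[f(R_p(S) \cup R_q(T))]$ exactly by enumeration for a small made-up cut function (the graph and the name `expected_f` are my own illustrations) and compares it against the lower bound:

```python
import itertools

# A small made-up cut function (illustrative only, not from the post).
EDGES = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 4), (3, 4)]

def f(A):
    A = set(A)
    return sum(1 for (u, v) in EDGES if (u in A) != (v in A))

def powerset(X):
    X = sorted(X)
    return [set(c) for r in range(len(X) + 1)
            for c in itertools.combinations(X, r)]

def expected_f(S, T, p, q):
    """Exact E[f(R_p(S) ∪ R_q(T))], enumerating all sub-sampled pairs."""
    total = 0.0
    for a in powerset(S):
        pa = p ** len(a) * (1 - p) ** (len(S) - len(a))
        for b in powerset(T):
            pb = q ** len(b) * (1 - q) ** (len(T) - len(b))
            total += pa * pb * f(a | b)
    return total

S, T, p, q = {0, 1}, {3, 4}, 0.5, 0.3   # S and T must be disjoint
lhs = expected_f(S, T, p, q)
rhs = ((1 - p) * (1 - q) * f(set()) + (1 - p) * q * f(T)
       + p * (1 - q) * f(S) + p * q * f(S | T))
print(lhs >= rhs)  # True: the FMV lower bound holds
```

On this instance the left side works out to 2.72 against a bound of 1.8, so the inequality holds with room to spare.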
Armed with this inequality, the expected value of a random set can be easily lower bounded. Let $S^*$ denote the optimal set and let $\overline{S^*}$ denote its complement. Then a uniformly random set $R$ is $R_{1/2}(S^*) \cup R_{1/2}(\overline{S^*})$. Thus, taking $p = q = \frac{1}{2}$,

$\mathbb{E}[f(R)] \geq \tfrac{1}{4} f(\emptyset) + \tfrac{1}{4} f(\overline{S^*}) + \tfrac{1}{4} f(S^*) + \tfrac{1}{4} f(U).$
Since $f(\emptyset)$, $f(\overline{S^*})$ and $f(U)$ are all non-negative, we conclude that a random set has, in expectation, at least a quarter of the value of the optimal set. In the case that $f$ is symmetric (i.e. $f(S) = f(U \setminus S)$ for all $S$), we have $f(\overline{S^*}) = f(S^*)$, and the same bound gives a half approximation, which is the best possible.
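This quarter guarantee can be checked in miniature by brute force. The sketch below uses a small made-up cut function (illustrative only), computes the optimum exactly, and compares it with the exact expected value of a uniformly random set:

```python
import itertools

# A small made-up graph; its cut function is submodular and symmetric.
EDGES = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 4), (3, 4)]
U = range(5)

def f(A):
    A = set(A)
    return sum(1 for (u, v) in EDGES if (u in A) != (v in A))

all_sets = [set(c) for r in range(len(U) + 1)
            for c in itertools.combinations(U, r)]

opt = max(f(S) for S in all_sets)                  # brute-force optimum
avg = sum(f(S) for S in all_sets) / len(all_sets)  # E[f(R)] for uniform R

print(opt, avg)        # 5 3.0 on this graph
assert avg >= opt / 4  # the 1/4 guarantee for any non-negative submodular f
assert avg >= opt / 2  # cut functions are symmetric, so 1/2 holds here too
```

Note that $\mathbb{E}[f(R)] = 3$ here is exactly half the number of edges, since a uniformly random set cuts each edge with probability 1/2.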
With an eye towards the next post, let me define the multilinear relaxation of a submodular function. A subset $S \subseteq U$ naturally corresponds to a 0-1 vector in $|U|$ dimensions, so a submodular function $f$ can be viewed as a map from the hypercube $\{0,1\}^{|U|}$ to the reals. The multilinear relaxation $F$, defined by Calinescu, Chekuri, Pál and Vondrák, is an extension of this map to the whole cube $[0,1]^{|U|}$. For a vector $x \in [0,1]^{|U|}$, the multilinear relaxation $F(x)$ is defined to be the expectation of $f(R_x)$, where $R_x$ is a random set defined by picking element $i$ with probability $x_i$, independently for each $i$. This has several useful properties and turns out to be the "right" extension of $f$ for submodular maximization problems. The concavity-type statement above is in fact a special case of a more general concavity property of $F$ (see this paper for a proof). In the next post, we will see a recent improvement to the above results, and this relaxation will come in handy.
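On a tiny ground set, $F$ can be evaluated by brute-force enumeration (for larger instances one would estimate it by sampling). The sketch below, with an illustrative made-up graph, computes $F(x)$ by summing over all $2^n$ sets and checks that $F$ agrees with $f$ at a 0-1 vertex of the cube:

```python
import itertools

# A small made-up cut function, used only to illustrate evaluating F.
EDGES = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 4), (3, 4)]
n = 5

def f(A):
    A = set(A)
    return sum(1 for (u, v) in EDGES if (u in A) != (v in A))

def F(x):
    """Multilinear relaxation: F(x) = E[f(R_x)], where element i is
    included in R_x independently with probability x[i]."""
    total = 0.0
    for bits in itertools.product([0, 1], repeat=n):
        prob = 1.0
        for i, b in enumerate(bits):
            prob *= x[i] if b else 1 - x[i]
        total += prob * f({i for i, b in enumerate(bits) if b})
    return total

# At a 0-1 vertex of the cube, F agrees with f on the corresponding set.
print(F([1, 1, 0, 0, 0]) == f({0, 1}))  # True
```

Restricted to a line segment in a coordinate direction, $F$ is linear (hence "multilinear"), and $F(\frac12, \ldots, \frac12)$ is exactly the expected value of a uniformly random set from the argument above.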