# Maximizing Submodular Functions (Part 2)

Continuing on my last post, today I will talk about recent work by Niv Buchbinder, Moran Feldman, Seffi Naor, and Roy Schwartz that gives a simple 1/2 approximation to the (unconstrained) submodular maximization problem, matching the hardness. Do see the paper (which should be available in a couple of weeks) for full details. Apologies in advance for any errors.

First consider the greedy algorithm that starts with the empty set, and keeps adding the most profitable item, as long as there is an improvement in the objective. For most variants of set cover type problems, such algorithms give the best possible approximation guarantees.

Unfortunately, it is not hard to construct set functions where the approximation ratio of greedy is unbounded. How about *reverse greedy*: starting with everything and throwing out elements while we improve? The function defined as is also submodular when is, and running the “reverse greedy” on is same as running greedy on . Thus a bad example for greedy can be transformed into one for reverse greedy.

The first idea that BFNS try is to do both greedy and reverse greedy, in a correlated way. Let’s start with and . In the ith step, we will make and agree in the ith coordinate, by either adding to or deleting from . A greedy approach to doing this is to add to if the marginal benefit exceeds . It can also by shown using the definition of submodularity that the sum is non-negative, so at least one of the sets improves in value. In steps, we have and that is the output of the algorithm.

Is this “coupled greedy algorithm” any good? BFNS first show that this gives a 1/3 approximation. Towards a short proof, here is a little more notation. Let be the optimal set. Let denote the tail set and let denote the part of in the tail. Let and denote the sets and at the end of iteration . Thus and . For simplicity of presentation, we will assume that .

Now here is the crucial definition. Consider the hybrid set . Thus we have and is the output of the algorithm. We don’t know but we know . In the ith step, we learn a little more about our set , and let denote the quantity . The total loss is . We will show that the total loss is at most . This would imply that claimed approximation ratio.

Suppose that in step , so that we add . We claim that in this case, . Two cases: if is in , then is zero since . On the other hand . If is not in , then . Thus is the marginal value of with respect to . Since , this marginal value is at least , which in turn means that . Summed over all iterations where , the total loss is then at most the sum of ‘s over these iterations. This is exactly .

An essentially identical argument shows that the sum over iterations when is also at most . This finishes the proof.

But 1/3 is not what I promised above. How do we improve on this? The answer to this question was in the last post: we will use the multilinear relaxation, which extends the defintion of a submodular function to the whole of . BFNS modify the above algorithm to do the coupled walks in instead of on . Start with and . In the ith step, we will make and equal, so that and agree in n steps. Defining as and as , we make an agree as follows. If only one of and is positive, the choice is obvious: we set in the first case, and make them both zero in the second case. In case they are both positive, it is natural to interpolate linearly: we set . Instead of the sets , above, we will have their fractional counterparts: vectors . Here is defined as being equal to the optimum on the tail, and equal to on the first $i$ coordinates.

The proof is now very similar to the previous case, except that we save a factor of two. The following inequality implies that the total loss is only , which will lead to a factor :

Proving this is not that hard once we know that this is what is to be proved. In the case that one of or is negative, this is fairly straightforward given the analysis in the approximation above. In the case that they are both non-negative, we need to compute the various terms. It is easy to check that in this case is equal to , and that is equal to . The quantity can be upper bounded, again using arguments similar to those above, by . Since , we are done.

Thanks for the post. For the 1/2 approximation, when calculating $a_i, b_i$, we need to calculate $F(x)$ for some fractional $x$. But when $x$ is fractional, do we need exponential number of calls to the oracle?

Good catch. I tried to gloss over this, but this is indeed an issue that has to be handled. Exactly computing F would indeed require an exponential number of calls. So one uses an estimate of F based on a small number of samples and argues that this suffices. This can be done and all works that use the multilinear relaxation have to deal with this. Here is one approach: using Chernoff bounds, you can show that if you take polynomially many random samples and average to estimate F, the additive error is at most M/poly(n) except with negilgible probability, where M is an upper bound on f. Intuitively, if an a_i or a b_i above is large, our estimates will be fairly accurate; if they are really small, we make a mistake, but then making a mistake shouldn’t effect us so much. One should be able to modify the analysis slightly to make this formal and argue that if we run the algorithm with these estimates with small additive errors, the final result is still a (1/2 – 1/poly) approximation.

Thanks for your reply. Also, for the 1/3-approximation proof, when $a_i \ge b_i$ and $i \not \in O$, I feel $-l_i$ should be at least $-b_i$ instead of $b_i$ since $-b_i$ is the marginal value of $i$ respect to $Y_i – i$. Also in this case, we can still say $l_i \le b_i \le a_i$.