Quick comments on the NIPS experiment

December 18, 2014

[One can tell it’s reviewing and letter-writing season when I escape to blogging more often..]

There’s been some discussion on the NIPS experiment, enough of it that even my neuroscientist brother sent me a link to Eric Price’s blog post. The gist of it is that the program chairs duplicated the reviewing process for 10% of the papers, to see how many would get inconsistent decisions, and it turned out that 25.9% of them did (one of the program chairs predicted that it would be 20% and the other that it would be 25%; see also here, here and here). Eric argues that the right way to measure disagreement is to look at the fraction of papers that process A accepted that would have been rejected by process B, which comes out to more than 50%.
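To see how a 25.9% disagreement rate can turn into "more than 50%" of the accepted papers, here is a back-of-the-envelope sketch. It assumes that both committees accept the same fraction of papers, so that the inconsistent decisions split evenly between "A accepts, B rejects" and "A rejects, B accepts". The acceptance rate of 22.5% used below is an assumption for illustration, not a figure taken from this post.

```python
def accepted_disagreement(inconsistent_frac, accept_rate):
    """Fraction of committee A's accepted papers that committee B rejects.

    Assumes both committees accept the same fraction of papers, so the
    inconsistent decisions split evenly between the two directions.
    """
    a_only = inconsistent_frac / 2.0  # papers accepted by A but rejected by B
    return a_only / accept_rate


# 0.259 is the figure quoted in the post; 0.225 is an ASSUMED acceptance rate.
print(accepted_disagreement(0.259, 0.225))
```

Under these assumptions the ratio is above one half, which matches the way Eric's number is described above.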

It’s hard for me to interpret this number. One interpretation is that it’s a failure of the refereeing process that people can’t agree on more than half of the list of accepted papers. Another viewpoint is that since the disagreement is not much larger than predicted beforehand, we shouldn’t be that surprised about it. It’s tempting when having such discussions to model papers as having some inherent quality score, where the goal of the program committee is to find all papers above a certain threshold. The truth is that different papers have different, incomparable qualities that appeal to different subsets of people. The goal of the program committee is to curate a diverse and intellectually stimulating program for the conference. This is an inherently subjective task, and it’s not surprising that different committees would arrive at different conclusions. I do not know what the “optimal” amount of variance in this process is, but I would have been quite worried if it were zero, since that would be a clear sign of groupthink. Lastly, I think this experiment actually points to an important benefit of the conference system. Unlike journals, where the editorial board tends to stay constant for a long period, in conferences one gets a fresh draw of the committee every 6 months or a year.

An observation

December 15, 2014

Last Friday in our theory reading group, Yael Kalai observed that there’s only one other woman in the room. She noticed it because in cryptography meetings, at least in the Boston area, there is a significantly higher female presence. Make no mistake, cryptography, even in Boston, still has a very lopsided gender ratio. But I think it is still a bit better than some of the other areas of theoretical computer science, largely due to a few strong role models such as Shafi Goldwasser. The gender ratio in TCS is sometimes thought of as an unfortunate but constant fact of life, that is due to larger forces in society beyond our control. Examples such as this show that actions by small communities or even individuals can make a difference.

I truly believe that theoretical computer science, and computer science at large, will grow significantly in importance over the coming decades, as it occupies a much more central place in many of the sciences and other human activities. We’ll have many more problems to solve, and we can’t do it without using half of the world’s talent pool.

Sum-of-Squares seminar: lecture notes and open problems

December 8, 2014

I just gave the final lecture in my seminar on Sum of Squares Upper Bounds, Lower Bounds, and Open Questions (see also this previous post). The lecture notes are available on the web page and also as a single pdf file. They are extremely rough but I hope they will still be useful, as some of this material is not covered (to my knowledge) elsewhere. (Indeed two of the papers we covered are “hot off the press”: the work of Meka-Potechin-Wigderson on the planted clique problem hasn’t yet been posted online, and the work of Lee-Raghavendra-Steurer on semidefinite extension complexity was just posted online two weeks ago.)

Some of the topics we covered included the SDP-based algorithms for problems such as Max-Cut, Sparsest-Cut, and Small-Set Expansion; lower bounds for Sum-of-Squares (3XOR/3SAT and planted clique); using SOS for unsupervised learning; how the SOS algorithm might (and of course also might not) be used to refute the Unique Games Conjecture; and linear programming and semidefinite programming extension complexity.

Thanks to everyone who participated in the course and in particular to the students who scribed the notes (both in this course and my previous summer course) and to Jon Kelner and Ankur Moitra who gave guest lectures!

I thought I might post here the list of open problems I ended with; feel free to comment on those or add some of your own below.

In most cases I phrased the problem as asking to show a particular statement, though of course showing the opposite statement would be very interesting as well. This is not meant to be a complete or definitive list, but could perhaps spark your imagination to think of those or other research problems of your own. The broader themes these questions are meant to explore are:

  • Can we understand in what cases SOS programs of intermediate degree (larger than {2} but much smaller than {n}) yield non-trivial guarantees?
    It seems that for some problems (such as 3SAT) the degree/quality curve has a threshold behavior, where we need to get to degree roughly \Omega(n) to beat the performance of the degree 1 algorithm, while for other problems (such as UNIQUE GAMES) the degree/quality curve seems much smoother, though we don’t really understand it.
    Understanding this computation/quality tradeoff in other settings, such as average case complexity, would be very interesting as well for areas such as learning, statistical physics, and cryptography.
  • Can we give more evidence to, or perhaps refute, the intuition that the SOS algorithm is optimal in some broad domains?
  • Can we understand the performance of SOS in average-case setting, and whether there are justifications to consider it optimal in this setting as well? This is of course interesting for both machine learning and cryptography.
  • Can we understand the role of noise in the performance of the SOS algorithm? Is noise a way to distinguish between “combinatorial” and “algebraic” problems in the sense of my previous post?

Well posed problems

Problem 1

Show that for every constant {C} there is some {\delta>0} and a quasipolynomial ({n^{polylog(n)}}) time algorithm that on input a subspace {V\subseteq \mathbb{R}^n}, can distinguish between the case that {V} contains the characteristic vector of a set of measure at most {\delta}, and the case that {\mathbb{E}_i v_i^4 \leq C (\mathbb{E}_i v_i^2)^2} for every {v\in V}. Extend this to a quasipolynomial time algorithm to solve the small-set expansion problem (and hence refute the small set expansion hypothesis). Extend this to a quasipolynomial time algorithm to solve the unique-games problem (and hence refute the unique games conjecture). If you think this cannot be done then even showing that the {d=\log^2 n} (in fact, even {d=10}) SOS program does not solve the unique-games problem (or the {4/2} norms ratio problem as defined above) would be very interesting.
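To make the {4/2} norms ratio condition concrete, here is a minimal sketch (the function names are mine, for illustration only) that computes {\mathbb{E}_i v_i^4 / (\mathbb{E}_i v_i^2)^2} for a vector. For the characteristic vector of a set of measure {\delta} both expectations equal {\delta}, so the ratio is {1/\delta}; this is why, once {\delta < 1/C}, the two cases in the problem cannot hold simultaneously.

```python
def four_two_ratio(v):
    """Return E_i[v_i^4] / (E_i[v_i^2])^2 for a nonzero vector v."""
    n = len(v)
    m2 = sum(x ** 2 for x in v) / n
    m4 = sum(x ** 4 for x in v) / n
    return m4 / (m2 ** 2)


def characteristic_vector(n, size):
    """Characteristic vector of a set of `size` coordinates out of n."""
    return [1.0] * size + [0.0] * (n - size)


n = 1000
sparse = characteristic_vector(n, 10)  # a set of measure delta = 0.01
flat = [1.0] * n                       # a perfectly "spread out" vector
print(four_two_ratio(sparse))          # close to 1/delta
print(four_two_ratio(flat))            # close to 1
```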

Problem 2

Show that there is some constant {d} such that the degree-{d} SOS program can distinguish between a random graph and a graph in which a clique of size {f(n)} was planted for some {f(n)=o(\sqrt{n})}, or prove that this cannot be done. Even settling this question for {d=4} would be very interesting.
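For readers who want to experiment, here is a small sketch of how one might sample the planted distribution. It assumes the usual convention that the random graph has edge probability 1/2 (the problem statement above only says "a random graph"), and the function name and parameters are hypothetical.

```python
import random


def planted_clique_graph(n, k, seed=0):
    """Sample G(n, 1/2) and then plant a clique on k random vertices.

    Returns the adjacency matrix and the sorted list of planted vertices.
    The edge probability 1/2 is an assumption of this sketch.
    """
    rng = random.Random(seed)
    adj = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < 0.5:
                adj[i][j] = adj[j][i] = True
    clique = sorted(rng.sample(range(n), k))
    for a in clique:
        for b in clique:
            if a != b:
                adj[a][b] = True  # force every pair inside the clique
    return adj, clique


adj, clique = planted_clique_graph(n=100, k=8)
print(clique)
```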

Problem 3

Show that the SOS algorithm is optimal in some sense for “pseudo-random” constraint satisfaction problems, by showing that for every predicate {P:\{0,1\}^k\rightarrow \{0,1\}}, {\epsilon>0} and pairwise independent distribution {\mu} over {\{0,1\}^k}, it is NP hard to distinguish, given an instance of MAX-{P} (i.e., a set of constraints each of which corresponds to applying {P} to {k} literals of some Boolean variables {x_1,\ldots,x_n}), between the case that one can satisfy {1-\epsilon} fraction of the constraints, and the case that one can satisfy at most {\mathbb{E}_{x\sim \mu} P(x) + \epsilon} fraction of them. (In a recent, yet unpublished, work with Chan and Kothari, we show that small degree SOS programs cannot distinguish between these two cases.)

Problem 4

More generally, can we obtain a “UGC free Raghavendra Theorem”? For example, can we show (without relying on the UGC) that for every predicate {P:\{0,1\}^k\rightarrow\{0,1\}}, {c>s} and {\epsilon>0}, if there is an {n}-variable instance of MAX-{P} whose value is at most {s} but on which the {\Omega(n)} degree SOS program outputs at least {c}, then distinguishing between the case that a CSP-{P} instance has value at least {c-\epsilon} and the case that it has value at most {s+\epsilon} is NP-hard?

Problem 5

Show that there is some {\eta>1/2} and {\delta<1} such that for sufficiently small {\epsilon>0}, the degree {n^{\delta}} SOS program for Max-Cut can distinguish, given a graph {G}, between the case that {G} has a cut of value {1-\epsilon} and the case that {G} has a cut of value {1-\epsilon^{\eta}}. (Note that Kelner and Parrilo have a conjectured approach to achieve this.) Can you do this with arbitrarily small {\delta>0}?

Problem 6

If you think the above cannot be done, even showing that the degree {d=10} (or even better, {d=\log^2 n}) SOS program cannot achieve this, even for the more general Max-2-LIN problem, would be quite interesting. As an intermediate step, settle Khot-Moshkovitz’s question whether for an arbitrarily large constant {c}, the Max-2-LIN instance they construct (where the degree {d} (for some constant {d}) SOS value is {1-\epsilon}) has actual value at most {1-c\epsilon}. Some intermediate steps that could be significantly easier are: the Khot-Moshkovitz construction is a reduction from a {k}-CSP on {N} variables that first considers all {n}-sized subsets of the {N} original variables and then applies a certain encoding to each one of those {\binom{N}{n}} “clouds”. Prove that if this is modified to a single {N}-sized cloud then the reduction would be “sound” in the sense that there would be no integral solution of value larger than {1-c\epsilon}. (This should be significantly easier to prove than the soundness of the Khot-Moshkovitz construction since it completely does away with their consistency test; still to my knowledge it is not proven in their paper. The reduction will not be “complete” in this case, since it will have more than exponential blowup and will not preserve SOS solutions but I still view this as an interesting step. Also if this step is completed, perhaps one can think of other ways than the “cloud” approach of KM to reduce the blowup of this reduction to {2^{\delta N}} for some small {\delta>0}, maybe a “biased” version of their code could work as well.)
The following statement, if true, demonstrates one of the challenges in proving the soundness of the KM construction: Recall that the KM boundary test takes a function {f:\mathbb{R}^n\rightarrow \{ \pm 1\}} and checks if {f(x)=f(y)} where {x} and {y} have standard Gaussian coordinates that are each {1-\alpha} correlated for some {\alpha \ll 1/n}. Their intended solution {f(x) = (-1)^{\lfloor \langle a,x \rangle \rfloor}} for {a\in\{\pm 1\}^n} will fail the test with probability {O(\sqrt{\alpha n})}. Prove that there is a function {f} that passes the test with {c \sqrt{\alpha n}} for some {c} but such that for every constant {d} and function {g} of the form {g(x) = (-1)^{\lfloor p(x) \rfloor}} where {p} is a polynomial of degree at most {d}, {|\mathbb{E} p(x)f(x) | = o(1/n)}.

Problem 7

Show that there are some constant {\eta<1/2} and {d}, such that the degree {d}-SOS program yields an {O(\log^\eta n)} approximation to the Sparsest Cut problem. If you think this can’t be done, even showing that the {d=8} algorithm doesn’t beat {O(\sqrt{\log n})} would be very interesting.

Problem 8

Give a polynomial-time algorithm that for some sufficiently small {\epsilon>0}, can (approximately) recover a planted {\epsilon n}-sparse vector {v_0} inside a random subspace {V\subseteq \mathbb{R}^n} of dimension {\ell=n^{0.6}}. That is, we choose {v_1,\ldots,v_{\ell}} as random Gaussian vectors, and the algorithm gets an arbitrary basis for the span of {\{v_0,v_1,\ldots,v_{\ell}\}}. Can you extend this to larger dimensions? Can you give a quasipolynomial time algorithm that works when {V} has dimension {\Omega(n)}? Can you give a quasipolynomial time algorithm for certifying the Restricted Isometry Property (RIP) of a random matrix?
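Here is a minimal sketch of how an instance of this problem could be generated for small experiments. The function name is hypothetical, and two choices are assumptions of the sketch rather than part of the problem: the planted vector has equal nonzero entries, and the "arbitrary basis" is obtained by mixing with a random Gaussian matrix (which is invertible with probability 1).

```python
import random


def planted_sparse_instance(n, ell, eps, seed=0):
    """Return (v0, basis): a planted eps*n-sparse vector and a mixed basis
    for span{v0, v1, ..., v_ell} with v1..v_ell random Gaussian vectors."""
    rng = random.Random(seed)
    k = max(1, int(eps * n))
    v0 = [0.0] * n
    for i in rng.sample(range(n), k):
        v0[i] = 1.0 / k ** 0.5  # unit-norm planted vector (an assumption)
    vectors = [v0] + [[rng.gauss(0.0, 1.0) for _ in range(n)]
                      for _ in range(ell)]
    # Hide v0 by taking random linear combinations of all ell+1 vectors.
    m = ell + 1
    mix = [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(m)]
    basis = [[sum(mix[r][c] * vectors[c][i] for c in range(m))
              for i in range(n)] for r in range(m)]
    return v0, basis


n = 200
ell = int(n ** 0.6)  # the dimension regime mentioned in the problem
v0, basis = planted_sparse_instance(n, ell, eps=0.05)
print(len(basis), len(basis[0]))
```

A recovery algorithm would be given only `basis`; `v0` is returned so that the output can be checked.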

Problem 9

Improve the dictionary learning algorithm of [Barak-Kelner-Steurer] (in the setting of constant sparsity) from quasipolynomial to polynomial time.

Problem 10

(Suggested by Prasad Raghavendra.) Can SDP relaxations simulate local search?
While sum of squares SDP relaxations yield the best known approximations for CSPs, the same is not known for bounded degree CSPs. For instance, MAXCUT on bounded degree graphs can be approximated better than the Goemans-Williamson constant {0.878..} via a combination of SDP rounding and local search. Here local search refers to improving the value of the solution by locally modifying the values. Show that for every constant {\Delta}, there is some {\epsilon>0, d\in\mathbb{N}} such that {d} rounds of SOS yield a {0.878.. + \epsilon} approximation for MAXCUT on graphs of maximum degree {\Delta}. Another problem to consider is maximum matching in 3-uniform hypergraphs. This can be approximated to a 3/4 factor using only local search (no LP/SDP relaxations), and some natural relaxations have a 1/2 integrality gap for it. Show that for every {\epsilon>0}, {O(1)} rounds of SOS give a {3/4-\epsilon} approximation for this problem, or rule this out via an integrality gap.

[Update: Prasad notes that the first problem for Max-Cut actually is solved as stated, since the paper also shows that the SDP integrality gap is better than  {0.878..} for Max-Cut on bounded degree graphs. The second question is still open to my knowledge, and more generally understanding if SOS can always simulate local search.]
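To illustrate what "locally modifying the values" can mean for Max-Cut, here is a sketch of the simplest single-vertex-flip local search. This is a generic textbook procedure and not the specific SDP-rounding-plus-local-search algorithm referred to above; the function names are mine.

```python
def cut_value(edges, side):
    """Number of edges whose endpoints lie on different sides."""
    return sum(1 for u, v in edges if side[u] != side[v])


def local_search_max_cut(n, edges, side=None):
    """Flip single vertices as long as a flip strictly increases the cut.

    Each flip increases the cut by at least one, so the loop terminates.
    At a local optimum every vertex has at least half of its edges cut.
    """
    side = list(side) if side is not None else [0] * n
    neighbors = [[] for _ in range(n)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    improved = True
    while improved:
        improved = False
        for u in range(n):
            same = sum(1 for w in neighbors[u] if side[w] == side[u])
            cut = len(neighbors[u]) - same
            if same > cut:  # flipping u cuts more edges than it uncuts
                side[u] = 1 - side[u]
                improved = True
    return side


edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]  # a 5-cycle
side = local_search_max_cut(5, edges)
print(side, cut_value(edges, side))
```

The local-optimality condition only guarantees a cut containing at least half of the edges, which is why local search is interesting here mainly in combination with a good starting solution.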

Problem 11

(Suggested by Ryan O’Donnell) Let {G} be the {n} vertex graph on {\{0,1,\ldots,n-1\}} where we connect every two vertices {i,j} such that their distance (mod {n}) is at most {\Delta} for some constant {\Delta}. The set {S} of {n/2} vertices with least expansion is an arc. Can we prove this with an SOS proof of constant (independent of {\Delta}) degree? For every {\delta>0} there is a {c} such that if we let {G} be the graph with {n=2^\ell} vertices corresponding to {\{0,1\}^\ell} where we connect vertices {x,y} if their Hamming distance is at most {c\sqrt{n}}, then for every two subsets {A,B} of {\{0,1\}^\ell} satisfying {|A|,|B| \geq \delta n}, there is an edge between {A} and {B}. Can we prove this with an SOS proof of constant degree?

[Update: William Perry showed a degree 4 proof (using the triangle inequality) for the fact that the least expanding sets in a power of the cycle are arcs. Sangxia Huang proved independently a similar result.]
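The first statement is easy to check by brute force on tiny instances, which can be a useful sanity check before looking for an SOS proof. The sketch below (function names are mine, and it assumes {n > 2\Delta} so that no neighbor is listed twice) counts the edges leaving a set in the power of the cycle and compares an arc with all other sets of the same size.

```python
from itertools import combinations


def boundary_edges(n, delta, subset):
    """Edges leaving `subset` in the graph on Z_n where i ~ j iff their
    cyclic distance is between 1 and delta. Assumes n > 2 * delta."""
    s = set(subset)
    count = 0
    for i in s:
        for step in range(1, delta + 1):
            for j in ((i + step) % n, (i - step) % n):
                if j not in s:
                    count += 1
    return count


def min_boundary(n, delta, size):
    """Smallest boundary over all subsets of the given size (brute force)."""
    return min(boundary_edges(n, delta, s)
               for s in combinations(range(n), size))


n, delta = 10, 2
arc = range(n // 2)
print(boundary_edges(n, delta, arc), min_boundary(n, delta, n // 2))
```

An arc has {\Delta(\Delta+1)} boundary edges regardless of {n}, while a set such as the even vertices has a boundary that grows linearly with {n}.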

Fuzzier problems

The following problems are not as well-defined, but this does not mean they are less important.

Problem A

Find more problems in the area of unsupervised learning where one can obtain an efficient algorithm by giving a proof of identifiability using low degree SOS.

Problem B

The notion of pseudo-distributions gives rise to a computational analog of Bayesian reasoning about the knowledge of a computationally-bounded observer. Can we give any interesting applications of this? Perhaps in economics? Or cryptography?

SOS, Cryptography, and {\mathbf{NP}\cap\mathbf{coNP}}

It sometimes seems as if in the context of combinatorial optimization it holds that “{\mathbf{NP}\cap\mathbf{coNP}=\mathbf{P}}”, or in other words that all proof systems are automatizable. Can the SOS algorithm give any justification to this intuition? In contrast note that we do not believe that this assertion is actually true in general. Indeed, many of our candidates for public key encryption (though not all; see discussion in [Applebaum, Barak, Wigderson]) fall inside {\mathbf{NP}\cap\mathbf{coNP}} (or {\mathbf{AM}\cap\mathbf{coAM}}). Can SOS shed any light on this phenomenon? A major issue in cryptography is (to quote Adi Shamir) the lack of diversity in the “gene pool” of problems that can be used as a basis for public key encryption. If quantum computers are built, then essentially the only well-tested candidates are based on a single problem: Regev’s “Learning With Errors” (LWE) assumption (closely related to various problems on integer lattices). Some concrete questions along these lines are:

Problem C

Find some evidence for the conjecture of Barak-Kindler-Steurer (or other similar conjectures) that the SOS algorithm might be optimal even in an average case setting. Can you find applications for this conjecture in cryptography?

Problem D

Can we use a conjectured optimality of SOS to give public key encryption schemes? Perhaps to justify the security of LWE? One barrier for the latter could be that breaking LWE and related lattice problems is in fact in {\mathbf{NP}\cap\mathbf{coNP}} or {\mathbf{AM}\cap\mathbf{coAM}}.

Problem E

Understand the role of noise in the performance of the SOS algorithm. The algorithm seems to be inherently noise robust, and it also seems that this is related to both its power and its weakness– as is demonstrated by cases such as solving linear equations where it cannot get close to the performance of the Gaussian elimination algorithm, but the latter is also extremely sensitive to noise.
Can we get any formal justifications to this intuition? What is the right way to define noise robustness in general? If we believe that the SOS algorithm is optimal (even in some average case setting) for noisy problems, can we get any quantitative predictions to the amount of noise needed for this to hold? This may be related to the question above of getting public key cryptography from assuming the optimality of SOS in the average case (see Barak-Kindler-Steurer and Applebaum-Barak-Wigderson).

Problem F

Related to this: is there a sense in which SOS is an optimal noise-robust algorithm or proof system? Are there natural stronger proof systems that are still automatizable (maybe corresponding to other convex programs such as hyperbolic programming, or maybe using a completely different paradigm)? Are there natural noise-robust algorithms for combinatorial optimization that are not captured by the SOS framework? Are there natural stronger proof systems than SOS (even non-automatizable ones) that are noise-robust and are stronger than SOS for natural combinatorial optimization problems?
Can we understand better the role of the feasible interpolation property in this context?

Problem G

I have suggested that the main reason that a “robust” proof does not translate into an SOS proof is by use of the probabilistic method, but this is by no means a universal law and getting better intuition as to what types of arguments do and don’t translate into low degree SOS proofs is an important research direction. Ryan O’Donnell’s problems above present one challenge to this viewpoint. Another approach is to try to use techniques from derandomization such as use of additive combinatorics or the Zig-Zag product to obtain “hard to SOS” proofs. In particular, is there an SOS proof that the graph constructed by Capalbo, Reingold, Vadhan and Wigderson (STOC 2002) is a “lossless expander” (expansion larger than {degree/2})? Are there SOS proofs for the pseudorandom properties of the condensers we construct in the work with Impagliazzo and Wigderson (FOCS 2004, SICOMP 2006) or other constructions using additive combinatorics? I would suspect the answer might be “no”. (Indeed, this may be related to the planted clique question, as these tools were used to construct the best known Ramsey graphs.)

Out the Window

November 24, 2014

The closing of MSR-SV two months ago raised a fair bit of discussion, and I would like to contribute some of my own thoughts. Since the topic of industrial research is important, I would like the opportunity to counter some misconceptions that have spread. I would also like to share my advice with anyone that (like me) is considering an industrial research position (and anyone that already has one).


On Thursday 09/18/2014, an urgent meeting was announced for all but a few in MSR-SV. The short meeting marked the immediate closing of the lab. By the time the participants came back to their usual building, cardboard boxes were waiting for the prompt packing of personal items (to be evacuated by the end of that weekend). This harsh style of layoffs was one major cause for shock and it indeed seemed unprecedented for research labs of this sort. But I find the following much more dramatic: Microsoft, like many other big companies, frequently evaluates its employees. A group of researchers that were immensely valuable according to Microsoft’s own metric just a couple of months before were thrown out to the hands of Microsoft’s competitors that were more than happy to oblige. Similarly, previously valued research projects were carelessly lost (quite possibly to be picked up by others). Excellence as defined by Microsoft did not protect you, impact did not protect you (among the positions eliminated were researchers that saved Microsoft ridiculously large sums of money, enough to pay for the entire lab for many years). Since Microsoft is publicly claiming “business as usual” (which should mean that the evaluation metric didn’t change), and since Microsoft was performing a relatively moderate force reduction (of roughly 5% of its non-Nokia workforce), I still find it all very hard to understand.

Why MSR-SV and Why not?

It is my opinion that no substantial explanation for the closing was given by Microsoft’s representatives to the general public and (as far as I have been told) to current Microsoft employees. In the absence of reliable official explanation, rumors and speculations flourished. What should be made absolutely clear is that MSR-SV was not closed for lack of impact. The lab had huge impact in all dimensions including impact measured in dollars.

It is true that some cannot understand how the academic-style model of MSR-SV could be beneficial for a company. But it seems amateurish to base business decisions on perception rather than reality. Indeed, previous management of MSR and Microsoft resisted pressures from outside of Microsoft to change MSR. The current management seems to be changing course.

This is indeed my own speculation – MSR is changing its nature and therefore chose to close the lab that embodied in the purest form what MSR is moving away from, sending a strong internal and external signal. I don’t know that this is the case, but any other explanation I heard characterizes parts of the management of MSR and Microsoft as either incompetent or malicious. There is every reason to believe that these are all bright individuals, and that the decision was carefully weighed (taking into account all the obvious internal and external consequences). I only wish they would own up to it.

Don’t Call it the “MSR Model”

There was a lot of discussion about the MSR model vs. the model of other industrial research labs. This is somewhat misguided: MSR is very large and hosts a lot of models. This is absolutely fine – a company like Microsoft has the need for all sorts of research, and different talents need different outlets. But this also means that the claim that “nothing really happened, we still have about 1000 PhDs” is not the whole truth. There is no other MSR-SV in MSR. There are of course other parts of MSR that share the MSR-SV philosophy, but they are now more isolated than before.

Empower Researchers and Engineers Alike

I encourage you to read Roy Levin’s paper on academic-style industrial labs. This is a time-tested formula which Roy, with his unique skills and vision and his partnership with Mike Schroeder, managed to perfect over the years. Microsoft’s action takes nothing away from Roy’s success. See Roy’s addendum below giving due credit to Bob Taylor.

If I want to summarize the approach, I would do it in two words: empower researchers. Empower them to follow their curiosity and to think long term. Empower them to collaborate freely. Empower them to stay an active part of the scientific community. When brilliant researchers with a desire to impact have such freedom to innovate, then great things happen (as proven by MSR-SV).

On the other hand, to be fair, other companies seem to be much better than Microsoft in empowering engineers to innovate and explore. This is wonderful and I admire these companies for it. In fact, the impediment for even more impact by MSR-SV was not the lack of incentive for researchers to contribute (we were highly motivated), but rather the incentive structure of some of the product groups we interacted with in which innovation was not always sufficiently rewarded.

The Cycle of Industry Labs.

Different companies need different things out of their research organizations (and some are successful without any research organization to speak of). I have no shred of criticism of other models, as long as companies are honest about them when recruiting employees.

I would still argue that the success of MSR-SV is evidence that “Roy’s model” is extremely powerful. This model facilitated impact that would have been impossible in other models.

Some companies cannot afford such long term investment but other companies cannot afford not making such an investment. Indeed, in many of the companies I talked with there is a growing understanding of the need for more curiosity-driven long-term research.

I am reminded that when AT&T Research (of which I was a part) suffered brutal layoffs, people mourned the end of “academic-style research” in industry. This didn’t happen then and it will not happen now, simply since the need for this style of research exists.

Job Security is the Security to Find a New Job

Given the above, it should be clear that being applied or even having an impact does not guarantee job security in industry. I saw it in the collapse of AT&T research many years ago. People that did wonderful applied work were the first to go (once corporate decided to change its business model). Predicting the future in industry is impossible, and there are many examples. I do not trust the prophecies of experts (they are mainly correct in retrospect). I also think that industry employees should avoid the danger of trusting the internal PR. Even when the “internal stories” are the result of good intentions (rather than cynical manipulation), they do not hold water when the time comes. If I blame myself for anything, it is only for taking some of the MSR management statements at face value.

So what should industry employees do? First, decide if you can live with uncertainty. Then, make sure that your current job serves your next job search (whether in industry or academia). Don’t trust management to take care of you, nor should you wait for Karma to kick in. This in particular means that not every industry job serves every career path and that one should be vigilant in preserving external visibility – whether it is via publishing, open source, or just contribution to projects that are understood externally.

Academia’s Reaction

Finally, one point about the open letter to Microsoft from a group of outstanding academics. This letter was not about the group of alumni MSR-SV employees. It is true that individuals whose career path is inconsistent with other industry jobs were put in a difficult position. But we will all land on our feet eventually, and we have no justification to indulge in self-pity. The letter was about the unwritten contract between academia and MSR which has arguably been broken. It was about understanding in which direction MSR is going, and accordingly what the new form of collaboration possible between academia and MSR can be. It was an attempt to start a discussion and it is a shame it was not answered more seriously.


Addendum by Roy Levin:

I want to add a small but important clarification to Omer’s post. The research environment of MSR Silicon Valley, which Mike Schroeder and I had the privilege of founding and managing, was inspired by Bob Taylor, for whom both of us worked at Xerox PARC and DEC SRC. The paper I wrote about research management, which Omer cited, describes how we applied Taylor’s deep understanding of research management in MSR Silicon Valley. (Indeed, my paper is chiefly an elaboration of a short paper Taylor co-authored in 2004.) Thus, MSR Silicon Valley was founded on proven models for corporate research, and they were not dramatically different from the broader MSR model that had been in place since Rick Rashid started MSR in 1991. Mike and I refined and reinterpreted what we had learned from Bob Taylor in previous labs (which Omer generously calls “perfecting” the model). Bob was the master, and we were his disciples.

Sanjeev Arora: Potential changes to STOC/FOCS: report from special FOCS session

November 11, 2014

As Boaz advertised, FOCS had a panel-led discussion on “How might FOCS and STOC evolve?” Here is a summary of that session by Sanjeev Arora:


This blog post is a report about a special 80 min session on the future shape of STOC/FOCS, organized by David Shmoys (IEEE TCMF Chair) and Paul Beame (ACM Sigact Chair) on the Saturday before FOCS in Philadelphia. Some 100+ people attended.

The panelists: Boaz Barak, Tim Roughgarden, and me. Joan Feigenbaum couldn’t attend but sent a long email that was read aloud by David. Avi Wigderson had to cancel last minute.

For those who don’t want to read further (spoiler alert): The panelists all agreed about the need to create an annual week-long event to be held during a convenient week in summer, which would hopefully attract a larger crowd than STOC/FOCS currently do. The decision was to study how to organize such an annual event, likely starting June 2017. Now read on.

Sole ground rule from David and Paul was: no discussion of open access/copyright, nor of moving STOC/FOCS out of ACM/IEEE. (Reason: these are orthogonal to the other issues and would derail the discussion.)

Boaz and Omer’s proposal in a nutshell (details are here): Fold STOC/FOCS into this annual event. Submissions and PC work for these two would work just as now with the same timetable. Actual presentations would happen at this annual event. But the annual event would be planned by a third PC that would decide upon how much time to allocate to each paper’s presentation; not all papers would be treated equally. This PC would also plan a multi-day program of plenary talks: invited speakers, and selected papers drawn from theory conferences of the past year including STOC/FOCS. (Some people expressed discomfort with creating different classes of STOC-FOCS papers. See Boaz and Omer’s blog post for more discussion, and also my proposal below.)

Tim’s ideas: It’s very beneficial to have such a mega event in some form. Logistics may be formidable and need discussing, but it would be good for the field to have a single clearing point for major results and place to catch up with others (for which it is important that the event is attractive enough to draw everybody). His other main point: the event should give a large number of people “something to do” by which he meant “something to present.” (Could be poster presentations, talks, workshops, etc.) This helps draw people into the event rather than make them feel like bystanders.

Joan’s email: Started off by saying that we should not be afraid of experimentation. Case in point: She tried a 2-tier PC a few years ago and while many people railed against it, nobody could pinpoint any impact on the quality of the final program. She thinks STOC/FOCS currently focus too much on technical wizardry. While this has its place, other aspects should be valued as well. With this preamble, her main proposal was: There should be an inclusive annual mega event that showcases good work in many different aspects of TCS, possibly trading off some mathematical depth with inclusiveness and intellectual breadth. Secondary proposal: to fix somehow the problem of incomplete papers. (She mentioned the VLDB model where the conference is also a journal.) Interestingly, I don’t detect such a crisis in TCS today; most people post full versions on arxiv. I do support looking at the VLDB model, but for a different reason: it’s our journal process that seems broken.

My proposal: Though it was a panel discussion, I prepared PowerPoint slides, which are available here. My proposal has evolved from my earlier blog post, which turned into a B. EATCS article. A guiding principle: “Add rather than subtract; build upon structures that already work well.” The STOC/FOCS PC process works well, with efficient reviewing and decision-making, though not everybody is happy with the decisions themselves. But the journal process is sclerotic and possibly broken, so proposals (such as Fortnow’s) that replace conferences with journals seem risky. Finally, let’s design any new system to maximize buy-in from our community.

So here’s the plan in brief: Keep STOC/FOCS as now, possibly increasing the number of acceptances to 100-ish, which would still fit in 3 days with 2 parallel sessions but no plenary talks. (“If you are content with your current STOC/FOCS, you don’t need to change anything.”) Then add 3-4 days of activity around STOC, including workshops, poster sessions, and lots of plenary sessions. Encourage other theory conferences to co-locate with this event.

See my article and slides for further details.

A Few Meta Points that I made.

These points are interrelated:

We are a part of computer science. I hope to be a realist here, not controversial. Our work involves and touches upon other disciplines: math, economics, physics, sociology, statistics, biology, operations research, etc. But most of our students will find jobs in CS departments or industrial labs, and practically none in these allied disciplines. CS is also the field (biology possibly excepted) with most growth and new jobs in the foreseeable future. Our system should be most attuned with the CS way of doing things. To shoot down an obvious straw man, we should avoid the Math mode of splitting into small sub-communities and addressing papers and research to a small group of experts. Our papers and talks should remain comprehensible and interesting for a broad TCS audience, and a significant fraction of our collective work should look interesting to a general CS audience. (Joan’s email made a similar point about the danger of what she calls “mathematization.”)

Senior people in TCS have been dropping out of the STOC/FOCS system. I am, at 46 years of age, a regular attendee, but most people my age and older aren’t. I have talked to them, and they often feel that STOC/FOCS values specialization: technical improvements to past work, and that sort of thing. Any reform should try to address their concerns, and I hope the mega event will bring them back. (My advice to these senior people: if you want to change STOC/FOCS, be willing to serve on the PC, and speak up.)

Short papers are better. There’s a strong trend towards preferring long papers with full proofs. I consider this the “Math model” because it rewards research topics and presentation aimed at a handful of experts. I favor an old-fashioned approach that’s still in fashion at top journals like Nature and Science: force authors to explain their ideas in 8 double-column pages (or some other reasonable page limit). No appendices allowed, though reviewers who need more details should be able to look up a time-stamped detailed version on arxiv. In other words, use arxiv to the fullest, but force authors to also write clean, self-contained and terse versions. This is my partial answer to the question “What is the value added by conferences?” (NB: I don’t sense a crisis of incorrect papers in STOC/FOCS right now. Plus it’s not the end of the world if a couple percent of conference papers turn out to be wrong; Science and Nature have a worse track record and are doing OK!)

Towards the end of the session David and Paul solicited further ideas from the audience. Sensing general approval of the June mega event, they announced that they will study this idea further, and possibly implement it starting in 2017, without waiting for other theory conferences to co-locate. Paul pointed to logistical hurdles, which necessitate careful planning. David observed that putting the spotlight on STOC may cause FOCS to wither away. Personally, I think FOCS will do fine and may even find a devoted audience of those who prefer a more intimate event.

So dear readers, please comment away with your reactions and thoughts. This issue creates strong opinions, but let’s keep it civilized. If you have a counter proposal, please put it on the web and send us the link; Paul and David are following this debate.

ps: I am skeptical of the value of anonymous comments and will tend to ignore them (and hope that the other commenters will too).


FOCS 2014 is starting

October 18, 2014

Hope everyone has a great FOCS! In the previous post we mentioned the two workshops on different aspects of the Fourier transform taking place today. I also wanted to mention the Tutorial on obfuscation today, with talks by Amit Sahai, Allison Lewko and Dan Boneh. The new constructions of obfuscation and their applications form one of the most exciting and rapidly developing research topics in cryptography (and all of theoretical CS) today, and this will be a great opportunity for non-specialists to catch up on some of these advances.

Applied mathematicians vs Theoretical Computer Scientists

October 12, 2014

[Guest post by Anna Gilbert, who is co-organizing with Piotr Indyk and Dina Katabi a FOCS 2014 workshop on The Sparse Fourier Transform: Theory and Applications, this Saturday 9am-3:30pm]

After reading Boaz’s post on Updates from the ICM and in particular his discussion of interactions between the TCS and applied math communities, I thought I’d contribute a few observations from my interactions with both, as I consider myself someone who sits right at the intersection. My formal training is in (applied) mathematics and I am currently a faculty member in the Mathematics Department at the University of Michigan. I have spent many years working with TCS people on streaming algorithms and sparse analysis and I worked at AT&T Labs (where the algorithms group was much larger than the “math” group).

There are definitely other TCS researchers who are quite adept at, and interested in, collaborations with applied mathematicians, electrical engineers, computational biologists, etc. There are also venues where both communities come together and try to understand what each other is doing. The workshop that Piotr Indyk, Dina Katabi, and I are organizing at FOCS this year is a good example and I encourage anyone interested in learning more about these areas to come. The speakers span a range of areas from TCS, applied math, and electrical engineering! What’s especially fascinating is the juxtaposition of our workshop on the sparse Fourier transform with another FOCS workshop that day on Higher-order Fourier Analysis. There are two workshops on Fourier analysis, a topic that is central to applied and computational mathematics, at a conference ostensibly on the Foundations of Computer Science!

Here are my observations of both communities (with a large bias towards examples in sparse approximation, compressed sensing, and streaming/sublinear algorithms):

1) Applied mathematicians are not nearly as mathematical as TCS researchers. By which I mean, the careful formal problem statements, the rigorous definitions, the proofs of correctness for an algorithm, the analysis of the use of resources, the definition of resources, etc. are not nearly as developed nor as important to applied mathematicians.

Here are two examples of the importance of clear, formal problem statements and the definition of resources. There are a number of different ways to formulate sparse approximation problems; some instantiations are NP-complete and some are not. Some are amenable to convex relaxation and others aren’t. For example, exact sparse approximation of an arbitrary input vector over an arbitrary redundant dictionary is NP-complete, but if we draw a dictionary at random and seek a sparse approximation of an arbitrary input vector, this problem is essentially the compressed sensing problem, for which we do have efficient algorithms (for suitable distributions on random matrices). Stated this way, it’s clear to TCS what the difference is in the problem formulations, but this is not the way many applied mathematicians think about these problems. To the credit of the TCS community, it recognized that randomness is a resource: generating the random matrix in the above example costs something and, should one want to design a compressed sensing hardware device, generating or instantiating that matrix “in hardware” will cost you resources beyond simple storage. Pseudo-random number generators are a central part of TCS and yet, for many applied mathematicians, they are a small implementation detail easily handled by a function call. Similarly, electrical engineers well-versed in hardware design will use a linear feedback shift register (LFSR) to build such random matrices without making any “use” of the theory of pseudo-random number generators. The gap between the mathematics of random matrices and the LFSR is precisely where pseudo-random number generators, small space constructions of pseudo-random variables, random variables with limited independence, etc. fit, but forming that bridge and, more importantly, convincing both sides that they need a bridge rather than a simple function call or a simple hardware circuit, is a hard thing to do and not one the TCS community has been successful at. (Perhaps it’s not even something they are aware of.)
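For readers who have not met an LFSR, here is a minimal sketch of the “simple hardware circuit” side of that gap. Everything specific in it is an illustrative assumption rather than something from the workshop or any particular device: the 16-bit register width, the textbook tap positions (16, 14, 13, 11), the seed `0xACE1`, and the helper names `lfsr16_step`, `lfsr16_bits`, `sign_matrix` and `lfsr16_period`. It fills a ±1 matrix from the register’s output bits and makes no claim that such a matrix has the properties a compressed sensing analysis would need.

```python
from itertools import islice


def lfsr16_step(state):
    """Advance a 16-bit Fibonacci LFSR by one step (taps 16, 14, 13, 11)."""
    # XOR the tapped bits to form the feedback bit, then shift it in at the top.
    feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (feedback << 15)


def lfsr16_bits(seed=0xACE1):
    """Yield the register's low bit forever, starting from a nonzero seed."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("seed must be nonzero: the all-zero state never changes")
    while True:
        yield state & 1
        state = lfsr16_step(state)


def sign_matrix(rows, cols, seed=0xACE1):
    """Fill a rows x cols matrix with +1/-1 entries read off the LFSR stream."""
    bits = lfsr16_bits(seed)
    return [[1 if b else -1 for b in islice(bits, cols)] for _ in range(rows)]


def lfsr16_period(seed=0xACE1):
    """Count steps until the register returns to its (nonzero) starting state."""
    start = seed & 0xFFFF
    if start == 0:
        raise ValueError("seed must be nonzero")
    state, steps = lfsr16_step(start), 1
    while state != start:
        state = lfsr16_step(state)
        steps += 1
    return steps


if __name__ == "__main__":
    for row in sign_matrix(4, 8):
        print(row)
    print("period:", lfsr16_period())
```

The whole matrix is determined by 16 bits of seed, so it never needs to be stored, and the stream repeats after a fixed period. Whether so little randomness suffices for a given recovery guarantee is exactly the kind of question the theory of pseudo-random generators and limited independence is meant to answer.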

2) Many TCS papers, whether they provide algorithms or discuss models of computation that could/should appeal to applied mathematicians, are written or communicated in a way that applied mathematicians can’t/don’t/won’t understand. And, sometimes, the problems that TCS folks address do not resonate with applied mathematicians because they are used to asking questions differently.

My biggest example here is sparse signal recovery as done by TCS versus compressed sensing. For TCS, it is very natural to ask to design both a measurement matrix and a decoding algorithm so that the algorithm returns a good approximation to the sparse representation of the measured signal. For mathematicians, it is much more natural to ask what conditions are sufficient (or, even better, necessary) for the measurement matrix and some existing algorithm (as opposed to one crafted specifically for the problem) to recover the sparse approximation. Applied mathematicians do not, in general, ask questions about how to generate such matrices algorithmically and how to compute with them, unless they are serious about implementing these algorithms and then, typically, these questions are software questions rather than mathematical ones. They are low-level details, not necessarily abstract questions to be addressed formally. As an even higher level example of a difference in goals, the notion of an approximation algorithm is foreign to applied mathematicians; that concept does not appear in numerical analysis. Typically, convergence rates or error analyses for numerical algorithms are expressed as a function of the step-size (for numerical integration, solving differential equations, etc.) or the number of iterations (for any iterative algorithm). It’s standard to seek a bound on the number of iterations one needs to guarantee an error (or relative error) of $\epsilon$ rather than a solution of value $(1 + \epsilon) \cdot \mathrm{OPT}$. The idea that for the given input, there is an optimal solution and we want our algorithm to return a solution that is close to that optimal is not a standard way of analyzing numerical algorithms. After all, that optimal solution may have terrible error and it’s not easy to determine what the optimal error is.
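One way to see the contrast is to write the two kinds of guarantee side by side. The notation below is an illustrative assumption, not taken from any particular paper: $x_k$ is the $k$-th iterate of a numerical method, $x^\ast$ the true solution it converges to, $k_0(\epsilon)$ the iteration bound being sought, and $\mathrm{ALG}(I)$, $\mathrm{OPT}(I)$ the cost of the algorithm’s output and of an optimal solution on input $I$ of a minimization problem.

```latex
% Numerical analysis: error measured against the true solution,
% as a function of the number of iterations k.
\| x_k - x^\ast \| \le \epsilon
  \quad \text{for all } k \ge k_0(\epsilon)

% Approximation algorithms (minimization): cost measured against
% the best achievable cost on the same input I.
\mathrm{ALG}(I) \le (1 + \epsilon) \cdot \mathrm{OPT}(I)
  \quad \text{for every input } I
```

The first bound says nothing about an optimum over a set of candidate solutions, and the second says nothing about how large $\mathrm{OPT}(I)$ itself is.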

3) Finally, for many applied mathematicians, computation is a means to an end (e.g., solve the problem, better, faster) as opposed to an equal component of the mathematical problem, one to be studied rigorously for its own sake. And, for a number of TCS researchers, actually making progress on a complicated, real-world problem takes a back seat to the intricate mathematical analysis of the computation. In order for both communities to talk to one another, it helps to understand what matters to each of them.

I think that Michael Mitzenmacher’s response to Boaz’s post echoes this last point when he says “I think the larger issue is the slow but (over long periods) not really subtle shift of the TCS community away from algorithmic work and practical applications.” That said, I am not sure either model is better. TCS research can be practical, applied math isn’t as useful as we’d like to think, and solving a problem better, faster can be done only after thorough, deep understanding, the type that TCS excels at.

