Last summer I gave a mini-course on the Sum of Squares algorithm in the Swedish Summer School of Computer Science.

It was a great experience – the venue was  Djurönäset  – a hotel in the beautiful Stockholm archipelgo with stunning views and great food. It was organized very smoothly by Jakob Nordström, Per Austrin, and Johan Håstad, andthe students were bright and enthusiastic. I could see them collaborating on homework problems well into the night, and I’m sure that beyond learning on theoretical computer science, they made great new connections.

In fact, I believe the only downside to that summer school was my own course – this was the first time I taught  this material and so it felt very much like a beta version. (On the other hand, Ryan O’Donnell gave a fantastic lecture series on analysis of Boolean function, so the students didn’t suffer too much.) Luckily, this year the summer school features two absolute top experts on coding theory, Venkat Guruswami and Sergey Yekhanin, who will give courses on developments that are not yet as widely known in the theoretical computer science community as they should be, but have had wide reaching impact. Polar codes are considered one of the great recent breakthroughs in coding theory as they provide in some sense the first explicit codes that achieve Shannon’s non-constructive bounds for the capacity of the binary symmetric channel. Codes with local decoding are a prime example of the surprising interplay between theory and practice. Originally arising from probabilistically checkable proofs, they have recently found very significant industrial applications,  saving companies 100’s of millions of dollars.

I hear that there are still a few slots left. If you are a Ph.D student, Masters student, or postdoc who want to know more about coding theory, I can hardly think of a more pleasant and beneficial way to do so. If you are interested, please apply as soon as possible at http://s3cs.csc.kth.se/application.php . For any questions, the organizers can be contacted at  s3cs-2015@csc.kth.se.

See here.

NSF-sponsored papers should be freely available no more than 12 months after publication in a journal.

This is not perfect, but a step in the right direction. Computer scientists should insist that conference proceedings are treated the same way, and made freely available no more than 12 months after publication.

Hat tip: Lance Fortnow.

There are only two FOCS/STOC chairs per year, and most would do just fine without my unsolicited advice. Nevertheless, I thought it might make sense to write down some of my thoughts after chairing FOCS 2014, and perhaps even some people that are not program chairs will find it of interest.

This is based on advice I received from many people, some of which are mentioned by name but most aren’t. However the opinions here are my own. Others’ opinions (including those of the people mentioned) may vary: there is more than one way to run this process. Despite the length of this document, there are still many points and details that I omitted. If you are going to be a program chair, feel free to contact me for more advice, examples of forms, documents, and my (badly written and comment-free) code for scripts and website. You can also ask questions in the comments below.

Many discussions on conferences focus on issues of process – one-tier vs two-tier PC, anonymous submissions, page limits, conflicts of interest rules etc… These are all important issues and as you’ll see below I do have my own opinions on them. But I believe that to a first approximation the only thing that matters is the work that is put in collectively by the chair, PC, and reviewers. If the papers get reviewed by the right people that spend sufficient time to do a quality job then the results will probably be fine no matter what process you use; otherwise no process will save you. To a large degree, whether this happens is controlled by the amount of effort you put in as chair in first selecting the right people and then constantly staying on top of the process making sure that no paper falls between the cracks.

There are two common arguments why things will turn out fine even if you don’t work as hard as you could. I disagree with them both:

“No matter what, the good papers will get in, the bad papers will be rejected, and the only job of the PC is to figure out the decisions for the papers in the middle.”

This is patently false as we all know famous examples of great papers that were nonetheless rejected from conferences. There is no reason to think that such colossal mistakes were unavoidable and could not have been prevented by the PC working harder. For every one of those famous examples, there are many other “near misses” where the PC made the right decision only because of the diligence and persistence of one person.

“Even if you reject a great paper, it’s not so bad, since the author can always submit it to the next conference.”

Sure the author would do fine, but this is not the point. You don’t work for the authors but for the community and it’s not the conference that honors the paper – it is the paper honors the conference. So when a great paper is omitted from the program it is the conference, and by extension the community, that suffers. This is true even if the paper appears in the next conference but there are also other cases: the paper might be rejected again, or (increasingly likely these days) the author might submit the paper to a specialized conference, and so it might take much more time for the work to get known by the general theoretical CS community.

With general ruminations out of the way, here are some concrete tips in chronological order, from the time you are asked to be a program chair to the actual conference.

In an undergraduate algorithms class we learn that an algorithm is a high level way to describe a computer program. The running time of the algorithm is the number of operations it takes on inputs of a particular size- the smaller the better. So, as even Barack Obama knows, if you implement Quick-Sort, with its running time of ${O(n\log n)}$, it would run faster than the ${O(n^2)}$-time algorithm Bubble-Sort. However, if you pick up a more recent algorithms paper, things become a little murkier. Many papers feature algorithms with running times such as ${O(n^6)}$ or even ${O(n^{20})}$, and even some linear-time algorithms have hidden constants that make you want to run to your bed and hide under the covers. Asymptotic analysis does become more applicable as technology improves and our inputs grow but some people may still question the practical relevance of, say, an ${O(n^{10})}$-time algorithm, which seems no more feasible than the trivial ${2^n}$-time brute force algorithm.

So why do we care about getting asymptotically good algorithms? Every TCS graduate student should be able to recite the “party line” answer that it has happened again and again that once a problem has been shown to be in polynomial time, people managed to come up with algorithms with small exponents (e.g. quadratic) and reasonable leading constants. There is certainly some truth to that (see for example the case of the Ellipsoid and Interior Point algorithms for linear programming) but is there an underlying reason for this pattern, or is it simply a collection of historical accidents?

Paul Erdős envisioned that “God” holds a book with the most elegant proof for every mathematical theorem. In a famous quote he said that “you don’t have to believe in God, but you have to believe in the Book”. Similarly, one can envision a book with the “best” (most efficient, elegant, “right”) algorithms for every computational problem. The ${2^n}$-time brute force algorithm is in this book, and the essence of the ${\mathbf{P}\neq\mathbf{NP}}$ conjecture is that this exhaustive search algorithm is in fact the best algorithm for some problems in ${\mathbf{NP}}$. When a researcher shows a cumbersome and impractical ${O(n^{20})}$-time algorithm $A$ for a problem ${X}$, the point is not that you should code up $A$ and use it to solve the problem ${X}$ in practice. Rather, the point is that $A$‘s existence proves that  ${X}$ is not one of those problems for which brute force is optimal. This is a very significant piece of information, since it tells us that the “Book algorithm” for ${X}$ will be much faster than brute force, and indeed once a non-trivial algorithm is given for a problem, further improvements often follow as researchers get closer and closer to the true “Book algorithm” for it. We’ve seen enough of the Book to know that not all algorithms in it are either ${O(n^2)}$ or ${2^{\Omega(n)}}$ time, but there should still be an inherent reason for the “Book algorithm” to have a non-trivial exponent. If the best algorithm for some problem ${X}$ is, say, ${n^{\log n}}$-time, then by reading that page of the Book we should understand why this is the right answer (possibly because solving ${X}$ boils down to running the brute force algorithm on some smaller domain).

The same intuition holds for hardness results as well. The famous PCP theorem can be thought of as a reduction from the task of exactly solving the problem SAT to the task of solving it approximately. But if you apply the original PCP reduction to the smallest known hard instances of SAT (which have several hundred variables) the resulting instance would be too large to fit in even the NSA’s computers. What is the meaning of such a reduction? This issue also arises in cryptography. As some people like to point out, often when you reduce the security of a cryptographic scheme $Y$ to the hardness of some problem $X$, the reduction only guarantees security for key sizes that are much larger than what we’d like to use in practice. So why do we still care about such reductions? The answer is that by ruling out algorithms that run much faster than the natural exponential-time one, the reduction gives at least some evidence that this particular approximation or crypto-breaking problem might actually one of those for which the exponential-time algorithm is in fact the “Book algorithm”. And indeed, just like the case for algorithms, with time people have often been able to come up with increasingly more efficient reductions.

Another way to think of this is that the goal of a theory paper is not to solve a particular problem, but to illustrate an idea- discover a “paragraph from the book” if you will. The main reason we care about measures such as asymptotic running time or approximation factor is not due to their own importance but because they serve to “keep us honest” and provide targets that we won’t be able to reach without generating new ideas that can later be useful in many other contexts.

Erdős’s quote holds true for theoretical computer scientists as well. Just as physicists need to believe in a “book” of the laws of nature to make inferences based on past observations, the Book of algorithms is what makes TCS more than a loose collection of unrelated facts. (For example, this is why in his wonderful essay, Russell Impagliazzo was right to neglect some “non-Book” scenarios such as $\mathbf{P}=\mathbf{NP}$ with an $n^{100}$-time algorithm.) Ours is a young science, and we only know very few snippets from the Book. I believe that eventually, even statements such as ${\mathbf{P}\neq\mathbf{NP}}$ would be seen as mere corollaries of much grander computational “laws of nature”.

[Updated 2/25 with some minor rephrasing]

And available at this web page. At some point I would like to add a direct link to the video for each paper from the program page, but I figured that it’s best to announce this now rather than let the perfect be the enemy of the good.

Update: Paul Beame notes below that the videos, as well as other content for this and previous FOCS’s are available on  http://ieee-focs.org

[One can tell it’s reviewing and letter-writing season when I escape to blogging more often..]

There’s been some discussion on the NIPS experiment, enough of it that even my neuro-scientist brother sent me a link to Eric Price’s blog post. The gist of it is that the program chairs duplicated the reviewing process for 10% of the papers, to see how many would get inconsistent decisions, and it turned out that 25.9% of them did (one of the program chairs predicted that it would be 20% and the other that it would be 25%, see also herehere and here). Eric argues that the right way to measure disagreement is to look at the fraction of papers that process A accepted that would have been rejected by process B, which comes out to more than 50%.

It’s hard for me to interpret this number. One interpretation is that it’s a failure of the refereeing process that people can’t agree on more than half of the list of accepted papers. Another viewpoint is that since the disagreement is not much larger than predicted beforehand, we shouldn’t be that surprised about it. It’s tempting when having such discussions to model papers as having some inherent quality score, where the goal of the program committee is to find all papers above a certain threshold. The truth is that different papers have different, incomparable qualities, that appeal to different subsets of people. The goal of the program committee is to curate an a diverse and intellectually stimulating program for the conference. This is an inherently subjective task, and it’s not surprising that different committees would arrive at different conclusions. I do not know what’s the “optimal” amount of variance in this process, but I would have been quite worried if it was zero, since it would be a clear sign of groupthink. Lastly, I think this experiment actually points out to an important benefit of the conference system. Unlike journals, where the editorial board tends to stay constant for a long period, in conferences one gets a fresh draw of the committee every 6 months or a year.

Last Friday in our theory reading group, Yael Kalai observed that there’s only one other woman in the room. She noticed it because in cryptography meetings, at least in the Boston area, there is a significantly higher female presence. Make no mistake, cryptography, even in Boston, still has a very lopsided gender ratio. But I think it is still a bit better than some of the other areas of theoretical computer science, largely due to a few strong role models such as Shafi Goldwasser. The gender ratio in TCS is sometimes thought of as an unfortunate but constant fact of life, that is due to larger forces in society beyond our control. Examples such as this show that actions by small communities or even individuals can make a difference.

I truly believe that theoretical computer science, and computer science at large, will grow significantly in importance over the coming decades, as it occupies a much more central place in many of the sciences and other human activities. We’ll have many more problems to solve, and we can’t do it without using half of the world’s talent pool.