STOC Theory Fest 2017 (Montreal June 19-23)
Sanjeev Arora, Paul Beame, Avrim Blum, Ryan Williams
SIGACT Chair Michael Mitzenmacher announced at the STOC’16 business meeting that starting in 2017, STOC will turn into a 5-day event, a Theory Fest. This idea was discussed at some length in a special session at FOCS 2014 and the business meeting at STOC 2015. Now the event is being planned by a small group (Sanjeev Arora, SIGACT ex-chair Paul Beame, Avrim Blum, and Ryan Williams; we also get guidance from Michael Mitzenmacher and STOC’17 PC chair Valerie King). We’re setting up committees to oversee various aspects of the event.
Here are the major changes (caveat: subject to tweaking in coming years):
(i) STOC talks go into 3 parallel sessions instead of two. Slight increase in number of accepts to 100-110.
(ii) STOC papers also required to be presented in evening poster sessions (beer/snacks served).
(iii) About 9 hours of plenary sessions, which will include: (a) Three keynote 50-minute talks (usually prominent researchers from theory and nearby fields); (b) Plenary 20-minute talks selected from the STOC program by the STOC PC: best papers, and a few others; (c) Plenary 20-minute talks on notable papers from the broader theory world in the past year (including but not limited to FOCS, ICALP, SODA, CRYPTO, QIP, COMPLEXITY, SoCG, COLT, PODC, SPAA, KDD, SIGMOD/PODS, SIGMETRICS, WWW, ICML/NIPS), selected by a committee from a pool of nominations. (Many nominees may be invited instead to the poster session.)
(iv) 2-hour tutorials (three in parallel).
(v) Some community-building activities, including grad student activities, networking, career advice, funding, recruiting, etc.
(vi) A day of workshops; 3 running in parallel. (Total of 18 workshop-hours.)
Our hope is that workshop day(s) will over time develop into a separate eco-system of regular meetings and co-located conferences (short or long). In many other CS fields the workshop days generate as much energy as the main conference, and showcase innovative, edgy work.
Poster sessions have been largely missing at STOC, but they have advantages: (a) Attendees can quickly get a good idea of all the work presented at the conference. (b) Grads and young researchers get more opportunities to present their work and to interact/network, fueled by beer and snacks. (c) Attendees get an easy way to make up for having missed a talk during the day, or to ask followup questions. (d) Posters on work from other theory conferences broaden the intellectual scope of STOC.
We invite other theory conferences to consider co-locating with the Theory Fest. To allow such coordination, in future the location/dates for the Theory Fest will be announced at least 18 months in advance, preferably 2 years. Even for 2017 it is not too late yet.
Finally, we see the Theory Fest as a work in progress. Feedback from attendees will be actively sought and used to refashion the event.
Happy families are all alike; every unhappy family is unhappy in its own way.
I am talking about the work Reed-Muller Codes Achieve Capacity on Erasure Channels by Shrinivas Kudekar, Santhosh Kumar, Marco Mondelli, Henry D. Pfister, Eren Sasoglu and Rudiger Urbanke. We are used to thinking of some error correcting codes as being “better” than others in the sense that they have fewer decoding errors. But it turns out that in some sense all codes of a given rate have the same average number of errors. The only difference is that “bad” codes (such as the repetition code) have a fairly “smooth” error profile, in the sense that the probability of decoding success decays essentially like a low degree polynomial with the fraction of errors, while for “good” codes the decay is like a step function, where one can succeed with probability close to $1$ when the fraction of errors is smaller than some threshold $\tau$ but this probability quickly decays to half once the fraction of errors passes $\tau$.
Specifically, if $C \subseteq \mathbb{F}_2^n$ is a linear code of dimension $k$ and $p \in [0,1]$, we let $Y$ be the random variable over $\{0,1,\star\}^n$ that is obtained by sampling a random codeword $X$ in $C$ and erasing (i.e., replacing it with $\star$) every coordinate independently with probability $p$. Then we define $h_C(p)$ to be the average over all $i \in [n]$ of the conditional entropy of $X_i$ given $Y_{-i}$, the coordinates of $Y$ other than the $i$-th. Note that for linear codes, the coordinate $X_i$ is either completely fixed by $Y_{-i}$ or it is a completely uniform bit, and hence $h_C(p)$ can be thought of as the expected fraction of the coordinates that we won’t be able to decode with probability better than half from a random subset containing roughly a $1-p$ fraction of the remaining coordinates.
One formalization of this notion that all codes have the same average number of errors is known as the Area Law for EXIT functions, which states that for every code $C$ of dimension $k$, the integral $\int_0^1 h_C(p)\,dp$ is a fixed constant independent of $C$. In particular note that if $C$ is the simple “repetition code” where we simply repeat every symbol $n/k$ times, then the probability we can’t decode some coordinate from the remaining ones (in which case the entropy is one) is exactly $p^{n/k-1}$, where $p$ is the erasure probability. Hence in this case we can easily compute the integral $\int_0^1 h_C(p)\,dp = \int_0^1 p^{n/k-1}\,dp = k/n$, which is simply the rate of the code. Combined with the area law, this tells us that for every code the average entropy $\int_0^1 h_C(p)\,dp$ is equal to the rate $k/n$ of the code. A code is said to be capacity achieving if there is some function $\epsilon(n)$ that goes to zero with $n$ such that $h_C(p) < \epsilon(n)$ whenever $p < 1 - k/n - \epsilon(n)$. The area law immediately implies that in this case it must be that $h_C(p)$ is close to one when $p > 1 - k/n$ (since otherwise the total integral would be smaller than $k/n$), and hence a code is capacity achieving if and only if the function $h_C(p)$ has a threshold behavior. (See figure below).
The paper above uses this observation to show that the Reed-Muller code is capacity achieving for this binary erasure channel. The only property they use is the symmetry of this code, which means that for this code we might as well have defined $h_C(p)$ with some fixed coordinate (e.g., the first one). In this case, using linearity, we can see that for every erasure pattern $\alpha$ on the coordinates $2,\ldots,n$ the entropy of $X_1$ given $Y_2,\ldots,Y_n$ is a Boolean monotone function of $\alpha$. (Booleanity follows because in a linear subspace the entropy of the remaining coordinate is either zero or one; monotonicity follows because in the erasure channel erasing more coordinates cannot help you decode.) One can then use the papers of Friedgut or Friedgut-Kalai to establish the threshold behavior. (The Reed-Muller code has an additional stronger property of double transitivity, which allows one to deduce that one can decode not just most coordinates but all coordinates with high probability when the fraction of errors is smaller than the capacity.)
How do you prove this area law? The idea is simple. Because of linearity, we can think of the following setting: suppose we have the all zero codeword and we permute its coordinates randomly and reveal the first $r$ of them. Then the probability that the $(r+1)^{st}$ coordinate is determined to be zero as well is $1 - h_C(1 - r/n)$. Another way to say it is that if we permute the columns of the $k \times n$ generating matrix of $C$ randomly, then the probability that the $(r+1)^{st}$ column is independent from the first $r$ columns is $h_C(1 - r/n)$. In other words, if we keep track of the rank of the first $r$ columns, then at step $r$ the probability that the rank will increase by one is $h_C(1 - r/n)$, but since we know that the rank of all $n$ columns is $k$, it follows that $\int_0^1 h_C(p)\,dp \approx \frac{1}{n}\sum_{r=0}^{n-1} h_C(1 - r/n) = k/n$, which is what we wanted to prove. QED
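For readers who want to see the area law on a concrete example, here is a minimal sketch that checks it exactly for small codes. It assumes the code is given by the columns of a generating matrix over GF(2), encoded as integers; the function names are hypothetical. It uses the fact that a coordinate is undecodable exactly when its column is not in the span of the revealed columns, together with the identity $\int_0^1 (1-p)^s p^{m-s}\,dp = \frac{1}{(m+1)\binom{m}{s}}$.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def gf2_rank(vectors):
    """Rank over GF(2) of a list of ints, each read as a bit vector."""
    pivots = {}  # leading-bit position -> basis vector with that leading bit
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v
                break
            v ^= pivots[top]  # cancel the leading bit and keep reducing
    return len(pivots)


def exit_area(columns):
    """Exact integral over p in [0,1] of the average erasure entropy.

    `columns` holds the columns of a generating matrix (assumed encoding:
    one int per column). Coordinate i is undecodable from a revealed set S
    of other coordinates iff column i is not in the span of the columns in S.
    """
    n = len(columns)
    total = Fraction(0)
    for i in range(n):
        others = columns[:i] + columns[i + 1:]
        for s in range(n):  # s = number of revealed coordinates among the others
            # Integral of (1-p)^s * p^(n-1-s) over [0,1].
            weight = Fraction(1, n * comb(n - 1, s))
            for revealed in combinations(others, s):
                revealed = list(revealed)
                if gf2_rank(revealed + [columns[i]]) > gf2_rank(revealed):
                    total += weight
    return total / n  # average over the n coordinates


# Repetition code with k = 2, n = 6, and a [7,4] code with a systematic generator.
repetition = [1, 1, 1, 2, 2, 2]
seven_four = [1, 2, 4, 8, 7, 11, 13]
print(exit_area(repetition), exit_area(seven_four))
```

Both codes give exactly $k/n$, even though their entropy profiles as a function of $p$ look very different.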
p.s. Thanks to Yuval Wigderson, whose senior thesis is a good source for these questions.
Attendance is free but registration is required. Also there are funds for travel support for students for which you should apply before August 1st.
Confirmed speakers are:
Incorporating differential privacy broadly into Apple’s technology is visionary, and positions Apple as the clear privacy leader among technology companies today.
Learning more about the underlying technology would benefit the research community and assure the public of the validity of these statements. (We, at Research at Google, are trying to adhere to the highest standards of transparency by releasing Chrome’s front-end and back-end for differentially private telemetry.)
I am confident this moment will come. For now, our heartfelt congratulations to everyone, inside and outside Apple, whose work made today’s announcement possible!
Yesterday Hillary Clinton became the first woman to be (presumptively) nominated for president by a major party. But in the eyes of many, the Republican Party was first to make history this election season by breaking the “qualifications ceiling” (or perhaps floor) in their own (presumptive) nomination.
Though already predicted in 2000 by the Simpsons, the possibility of a Trump presidency has rattled enough people that even mostly technical bloggers such as Terry Tao and Scott Aaronson felt compelled to voice their opinion.
We too have been itching for a while to weigh in and share our opinions and to use every tool at our disposal for that, including this blog. We certainly think it’s very appropriate for scientists to be involved citizens and speak up about their views. But though we debated it, we felt that this being a group (technical) blog, it’s best not to wade into politics (as long as it doesn’t directly touch on issues related to computer science such as the Apple vs. FBI case). Hence we will refrain from future postings about the presidential election. For full disclosure, both of us personally support Hillary Clinton and have been donating to her campaign.
Among other things, Bobby showed the proof of the following result, that demonstrates much of those ideas:
Theorem: (Ellenberg and Gijswijt, building on Croot-Lev-Pach) There exists an absolute constant $\epsilon > 0$ such that for every $A \subseteq \mathbb{F}_3^n$, if $|A| > (3-\epsilon)^n$ then $A$ contains a 3-term arithmetic progression.
To put this in perspective, up till a few weeks ago, the best bounds were of the form $3^n/n^{1+\epsilon}$ and were shown using fairly complicated proofs, and it was very reasonable to assume that a bound of the form $3^{n-o(n)}$ is the best we can do. Indeed, an old construction of Behrend shows that this is the case in other groups such as the integers modulo some large $N$ or $\mathbb{F}_p^n$ where $p$ is some large value depending on $n$. The proof generalizes to $\mathbb{F}_q^n$ for every constant prime $q$ (and for composite order cyclic groups as well).
The proof is extremely simple. It seems to me that it can be summarized to two observations:
Let’s now show the proof. Assume towards a contradiction that $A \subseteq \mathbb{F}_3^n$ satisfies $|A| \geq (3-\epsilon)^n$ ($\epsilon$ can be some sufficiently small constant, $\epsilon = 0.1$ will do) but there do not exist three distinct points $a,b,c \in A$ that form a $3$-a.p. (i.e., such that $c - b = b - a$ or, equivalently, $a + c = 2b$).
Let $L(d)$ be the number of $n$-variate monomials over $\mathbb{F}_3$ where each variable has individual degree at most $2$ (higher degree can be ignored modulo $3$) and the total degree is at most $d$. Note that there are $3^n$ possible monomials where each degree is at most two, and their degree ranges from $0$ to $2n$, where by concentration of measure most of them have degree roughly $n$. Indeed, using the Chernoff bound we can see that if $\epsilon > 0$ is a sufficiently small constant, we can pick some $\delta > 0$ such that if $d = (2-\delta)n$ then $L(d) \geq 3^n - 0.1(3-\epsilon)^n$ but $L(d/2) \leq 0.1(3-\epsilon)^n$ (to get optimal results, one sets $d$ to be roughly $4n/3$ and derives $\epsilon$ from this value).
Now, if we choose $d$ in that manner, then we can find a polynomial $P:\mathbb{F}_3^n \rightarrow \mathbb{F}_3$ of degree at most $d$ that vanishes on $\overline{2A} = \mathbb{F}_3^n \setminus \{ 2a : a \in A \}$ but is non zero on at least $|A|/2$ points. Indeed, finding such a polynomial amounts to solving a set of $3^n - |A|$ linear equations in $L(d) \geq 3^n - 0.1|A|$ variables.^{1} Define the $|A| \times |A|$ matrix $M$ such that $M_{a,b} = P(a+b)$. Since the assumption that $|A| \geq (3-\epsilon)^n$ implies that $L(d/2) \leq 0.1|A|$, the theorem follows immediately from the following two claims:
Claim 1: $\mathrm{rank}(M) \geq |A|/2$.
Claim 2: $\mathrm{rank}(M) \leq 2L(d/2)$.
Claim 1 is fairly immediate. Since $A$ is $3$-a.p. free, for every $a \neq b$ in $A$, $a+b$ is not in $2A$ and hence $M$ is zero on all the off diagonal elements. On the other hand, by the way we chose it, $M$ has at least $|A|/2$ nonzeroes on the diagonal.
For Claim 2, we expand $P(x+y)$ as a polynomial of degree $d$ in the two variables $x$ and $y$, and write $M = M' + M''$ where $M'$ corresponds to the part of this polynomial where the degree in $x$ is at most $d/2$ and $M''$ corresponds to the part where the degree in $x$ is larger and hence the degree in $y$ is at most $d/2$. We claim that both $\mathrm{rank}(M')$ and $\mathrm{rank}(M'')$ are at most $L(d/2)$. Indeed, we can write $M'_{a,b}$ as $\sum_{\alpha} C_\alpha a^{\alpha} P_\alpha(b)$ for some coefficients $C_\alpha \in \mathbb{F}_3$ and polynomials $P_\alpha$, where $\alpha$ indexes the $L(d/2)$ monomials in $x$ of degree at most $d/2$. But this shows that $M'$ is a sum of at most $L(d/2)$ rank one matrices and hence $\mathrm{rank}(M') \leq L(d/2)$. The same reasoning shows that $\mathrm{rank}(M'') \leq L(d/2)$, thus completing the proof of Claim 2 and the theorem itself.
More formally, we can argue that the set of degree $d$ polynomials that vanish on $\overline{2A}$ has dimension at least $L(d) - (3^n - |A|) \geq |A|/2$ and hence it contains a polynomial with at least this number of nonzero values.↩
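The monomial count that drives the proof above is easy to explore numerically. The following is a minimal sketch, assuming the setting of three-term progressions over the field with three elements, so that every exponent is 0, 1 or 2. The function names and the particular cutoff used in the example are illustrative choices rather than anything taken from the talk.

```python
import math


def monomial_counts(n):
    """counts[t] = number of n-variate monomials with every exponent in {0, 1, 2}
    and total degree exactly t (the coefficients of (1 + x + x^2)^n)."""
    counts = [1]
    for _ in range(n):
        nxt = [0] * (len(counts) + 2)
        for t, c in enumerate(counts):
            for e in range(3):  # exponent of the newly added variable
                nxt[t + e] += c
        counts = nxt
    return counts


def num_monomials(n, d):
    """Number of such monomials with total degree at most d."""
    return sum(monomial_counts(n)[: d + 1])


def growth_rate(n, d):
    """The n-th root of the count, i.e. the base of its exponential growth."""
    return math.exp(math.log(num_monomials(n, d)) / n)


n = 300
print(num_monomials(n, 2 * n) == 3 ** n)  # all monomials are counted
print(growth_rate(n, n))                  # cutoff at the typical degree
print(growth_rate(n, 2 * n // 3))         # cutoff well below the typical degree
```

With the cutoff at the typical degree $n$ the count still grows with base close to 3, while a cutoff a constant factor below $n$ gives a strictly smaller base. This gap is what the Chernoff bound step exploits.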
Northwestern University held a workshop on semidefinite programming hierarchies and sum of squares. Videos of the talks by Prasad Raghavendra, David Steurer and myself are available from the link above. The content to unicorns ratio in Prasad and David’s talks is much higher ☺
If you want to celebrate towel day in Cambridge, bring your towel to Tselil Schramm’s talk on refuting random constraint satisfaction problems using Sums of Squares in MIT’s algorithms and complexity seminar (4pm, room 32-G575).
Here are some of the tools I use. This is obviously a very biased list and reflects my limitations as a Windows user that is too stupid to learn to use emacs and vim. Please share your better tips in the comments:
LaTeX editor / collaboration platform: I recently discovered Overleaf which can be described as “Google Docs” for LaTeX. It’s a web-based LaTeX editor that supports several people editing the same document simultaneously, so it’s great for multi-author projects, especially for those last days before the deadline where everyone is editing at once. There are few things as satisfying as watching improvements and additions being added to your paper as you’re reading it. One of the features I like most about it is its git integration. This means that you can also work offline on your favorite editor and pull and push changes from/to the overleaf repository. It also means that authors that don’t want to use overleaf (but can use git) can still easily collaborate with those that do. I actually like the editor enough that I’ve even used it for standalone papers.
Version control: I’ve mentioned git above, and this is the version control I currently use. I found source tree to be an easy GUI to work with git, and (before I switched to overleaf) bitbucket to be a good place to host a git repository. Git can sound intimidating but it takes 5 minutes to learn if someone who knows it explains it to you. The most important thing is to realize that commit and push are two separate commands. The former updates the repository that is on your local machine and the latter synchronizes those changes with the remote repository. The typical workflow is that you first pull updates, then make your edits, then commit and push them. As long as you do this frequently enough you should not have serious conflicts. I am also committing the cardinal sin of putting my git repositories inside my dropbox folder. There are a number of reasons why it’s a bad idea, but I find it too convenient to stop. To make this not blow up, I never use the same folder from two different computers and hence have subfolders “Laptop” and “Desktop” for the repositories used by these computers respectively. Update 4/29: Clement Canonne mentions in a comment the Gitobox project that synchronizes a dropbox folder and a git repository and so allows easier collaboration with your non-git-literate colleagues.
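The pull, edit, commit, push cycle described above can be scripted. Here is a minimal sketch in Python, assuming `git` is on the PATH and the folder is already a clone with a configured remote. The function names are made up for illustration; the git subcommands themselves are standard.

```python
import subprocess


def start_commands():
    """Commands to run before editing: fetch and merge collaborators' changes."""
    return [["git", "pull"]]


def finish_commands(message):
    """Commands to run after editing.

    `commit` only records the changes in the local repository;
    `push` is the separate step that sends them to the remote.
    """
    return [
        ["git", "add", "--all"],
        ["git", "commit", "-m", message],
        ["git", "push"],
    ]


def run_all(commands, repo_dir):
    """Run each command inside repo_dir, stopping at the first failure.

    Note: `git commit` exits with an error when there is nothing to commit,
    so this raises CalledProcessError in that case.
    """
    for cmd in commands:
        subprocess.run(cmd, cwd=repo_dir, check=True)


# Typical session (the path is a placeholder):
#   run_all(start_commands(), "path/to/paper")
#   ... edit the files ...
#   run_all(finish_commands("Tighten proof of Lemma 3"), "path/to/paper")
```

Keeping the pull and the commit/push halves separate mirrors the workflow in the text: the edits happen in between.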
Markdown: I recently discovered markdown and particularly its pandoc flavored variety as a quick and easy way to write any technical document – lecture notes, homework assignments, blog posts, technical emails – that is not an actual paper. It’s just much more lightweight than LaTeX and so you type things faster, but it can still handle LaTeX math and (using pdflatex) compile to both html and pdf. All the lecture notes for my crypto course were written in markdown and compiled to both html and pdf using pandoc.
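As an illustration of compiling one markdown source to both formats, here is a small sketch that builds the two pandoc invocations. It assumes `pandoc` and `pdflatex` are installed; the helper names and the file name are hypothetical, and the exact options used for the course notes are not known, so treat the flags as one reasonable choice.

```python
import subprocess
from pathlib import Path


def pandoc_commands(source):
    """Return the pandoc command lines for an HTML and a PDF build of `source`."""
    src = Path(source)
    return [
        # Standalone HTML page; LaTeX math is rendered in the browser via MathJax.
        ["pandoc", str(src), "--standalone", "--mathjax",
         "-o", str(src.with_suffix(".html"))],
        # PDF produced through LaTeX, using pdflatex as the engine.
        ["pandoc", str(src), "--pdf-engine=pdflatex",
         "-o", str(src.with_suffix(".pdf"))],
    ]


def build(source):
    """Run both builds, raising if pandoc reports an error."""
    for cmd in pandoc_commands(source):
        subprocess.run(cmd, check=True)


# Example (placeholder file name): build("lecture1.md")
```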
Editor: I was a long time user of winedt but partially because of markdown and other formats, I decided to switch to a more general purpose editor that is not as latex centric. I am currently mostly using the Atom editor that I feel is the “editor of the future” in two senses. First, it is open source, backed by a successful company (github) and has a vibrant community working on extending it. Second, it’s the editor of the future in the sense that it doesn’t work so well in the present, and it sometimes hangs or crashes. If you prefer the “editor of the present” then sublime text might be for you. Some Atom packages I use include Build, Emmet, latextools, language-latex, language-pfm, Markdown Preview Plus, Mathjax-wrapper, pdf-view, preview-inline, sync-settings (I find it also crucial to enable autosave ).
Remote collaboration: I use several tools to collaborate with people remotely. I’ve used slack as a way to maintain a long-running discussion relating to a project. It works much better than endless email threads. In a technical discussion we’ll sometimes open a google hangouts video chat and in addition use appear to share a screen of a OneNote pad (assuming one or both of the discussants has a pen-enabled computer such as the Microsoft Surface Pro or Surface Book that seem to become the theorists’ new favorite these days). A good quality camera aimed at the whiteboard also works quite well in my experience.
Presentation: As I’ve written before, Powerpoint has awesome support for math in presentations. One thing which I would love to have – a visual basic script that goes over all my slides and changes all the math in them to a certain color. It’s a pain to do this manually.
Bibliography maintenance: Here is where I could use some advice. It seems that in every paper I end up spending the last few hours before the deadline scouring DBLP and Google Scholar for bib items and copying/pasting/formatting them. I wish there was an automatic script that would scan my tex sources for things like \ref{Goldwasser-Micali-Rackoff??} and return a bibtex file containing all of the best matches from DBLP/Google Scholar. Bonus points if it can recognize both the conference and journal version and format a bibtex which cites the journal version but adds a note “Preliminary version in STOC ’85”. Update 4/29: A commenter mentions the CryptoBib project that maintains a super-high-quality bibtex file of all crypto conferences and some theory conferences such as ICALP, SODA, STOC, FOCS. Would be great if this was expanded to all theory conferences.
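A first step toward such a script might look like the sketch below. It only collects citation keys from LaTeX source and turns each key into a DBLP search URL; it does not fetch or rank results. It assumes citations are written with `\cite`-style commands and that keys are made of author names separated by punctuation, and it assumes DBLP's publication search endpoint with a `format=json` parameter. The function names are hypothetical.

```python
import re
from urllib.parse import urlencode

# Matches \cite, \citep, \citet, ... with optional [..] arguments (an assumption
# about how the sources are written; other citation macros would need adding).
CITE_RE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")


def cited_keys(tex):
    """Return the distinct citation keys in `tex`, in order of first appearance."""
    keys = []
    for group in CITE_RE.findall(tex):
        for key in group.split(","):
            key = key.strip()
            if key and key not in keys:
                keys.append(key)
    return keys


def dblp_query_url(key):
    """Turn a key such as 'Goldwasser-Micali-Rackoff??' into a DBLP search URL."""
    words = re.sub(r"[^A-Za-z0-9]+", " ", key).split()
    query = urlencode({"q": " ".join(words), "format": "json"})
    return "https://dblp.org/search/publ/api?" + query


tex = r"Zero knowledge \cite{Goldwasser-Micali-Rackoff??} and \citep[Sec.~2]{Yao82, Blum-Micali}."
for key in cited_keys(tex):
    print(key, dblp_query_url(key))
```

Picking the best match among the search results, and merging conference and journal versions, is the hard part and is left open here.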
Note taking: I am still a fan of a yellow pad with pen, but I find myself using OneNote quite often these days on my surface book (which is also the “computer of the future” in a sense quite similar to Atom..). It’s useful to take notes in talks, and also to write notes for myself before teaching a class so that they would be available for me the next time I teach it.
p.s. Thanks to David Steurer who I learned many of these tools from, though he shares no responsibility for my failure to learn how to use emacs.
Also, the preliminary program is available at http://highlightsofalgorithms.org/program/.
The program is packed with 28 invited talks and an even larger number of short contributions.
Those interested in attending the conference are advised to book accommodation as early as possible (given the high hotel prices in Paris).