
#metootcs

February 15, 2018

In an earlier post I asked if we have TCS “Harvey Weinsteins”. Unfortunately academia is hardly immune from sexual harassment and now a  TCS researcher posted about her experiences with sexual harassment and assault in our community. While this is not pleasant reading, it is important, and I urge you to read the full post (see also here).

It takes courage to talk about such painful experiences, even anonymously, in a community as small as ours. This researcher deserves our thanks for bringing up this topic, and hopefully starting a conversation that will make theoretical computer science more welcoming and inclusive. I do not know who this person is, and I urge people not to try to guess her identity, but rather to focus on asking what we can do, both men and women, to make things better.

The blog post already contains some good suggestions. As the overwhelming majority in our field, we men enjoy many structural advantages, and it is especially up to us to step up to this challenge. Research is not a 9 to 5 job: conferences, workshops, and informal interactions are extremely important. But we should remember that we are there for the sake of scientific collaboration. A woman shouldn’t have to worry about the motivations behind every invitation for a discussion or meeting.

Given our skewed gender ratio, it is enough for a small minority of the men to behave badly to essentially guarantee that almost all women will encounter such behavior at some point. That is unless other men step up and call out unprofessional (or worse) behavior when we observe or are made aware of it. I know this is not easy – we are not selected for our ability to handle awkward social situations, and I can’t say I myself have  stepped up or been very aware of such issues in the past. But I will try to do better, and hope others do too.

Looking for car keys under the streetlight

February 12, 2018

In NIPS 2017, Ali Rahimi and Ben Recht won the test of time award for their paper “Random Features for Large-scale Kernel Machines”. Ali delivered the following acceptance speech (see also addendum) in which he said that Machine Learning has become “alchemy” in the sense that it involves more and more “tricks” or “hacks” that work well in practice but are very poorly understood. (Apparently alchemists were also successful in making many significant practical discoveries.) Similarly, when I teach cryptography I often compare the state of “pre-modern” cryptography (before Shannon and Diffie-Hellman) to alchemy.

Yann LeCun was not impressed with the speech, saying that sticking to using methods for which we have theoretical understanding is “akin to looking for your lost car keys under the street light knowing you lost them someplace else.” There is a sense in which LeCun is very right. For example, already in the seminal paper in which Jack Edmonds defined the notion of  polynomial time he said that “it would be unfortunate for any rigid criterion to inhibit the practical development of algorithms which are either not known or known not to conform nicely to the criterion.” But I do want to say something in defense of “looking under the streetlight”. When we want to understand the terrain, rather than achieve some practical goal, it can make a lot of sense to start in the simplest regime (e.g. “most visible” or “well lit”) and then expand our understanding (e.g., “shine new lights”). Heck, it may well be that when the super intelligent robots are here, then they would look for their keys by first making observations under the light and then extrapolating to the unlit area.

On double blind reviews in theory conferences

January 11, 2018

Michael Mitzenmacher points to two posts by Suresh Venkatasubramanian on the issue of so-called “double blind reviews” (i.e., anonymous submissions) in theory conferences. In short, both Michael and Suresh think they are a good idea. I agree with much of their motivation, but, based on my experience as both reviewer and author in non-blinded (e.g., STOC/FOCS) and blinded (e.g., CRYPTO) conferences, I do not think double blind reviewing is a good fit for theoretical computer science.

Let me say right off the bat that I think implicit (and, as Michael says, sometimes explicit) bias is a very real phenomenon. Moreover, such biases are not just a problem in the  sense that they are “unfair” to authors, but they cause real harm to science, in suppressing the contributions from certain authors.  Nor do I have any principled objection to  anonymization: I do for example practice anonymous grading in my courses for exactly this reason. I also don’t buy the suggestion that we must know the author’s identity to evaluate if the proof is correct. Reviewers can (and do) evaluate whether a proof makes sense without needing to trust the author.

However, there is a huge difference between grading a problem set and refereeing a paper. In the latter case, and in particular in theoretical computer science, you often need the expertise of very particular people that have worked on this area. By the time the paper is submitted to a conference, these experts have often already seen it, either because it was posted on the arxiv/eccc/eprint, or because they have seen a talk on it, or perhaps they have already discussed it with the authors by email.

More generally, these days much of theoretical CS is moving to the model where papers are first posted online, and by the time they are submitted to a conference they have circulated quite a bit around the relevant experts. Posting papers online is very good for science and should be encouraged, as it allows fast dissemination of results, but it does make the anonymous submission model obsolete.

One could say that if the authors’ identity is revealed then there is no harm, since in such a case we simply revert to the original form of non-anonymous submissions. However, the fact that the authors’ identity is known to some but not all participants in the process (e.g., maybe some reviewers but not others) makes some conflicts and biases invisible. Moreover, the fact that the authors’ identity is not “officially” known causes a lot of practical headaches.

For example, as a PC member you can’t just shoot a quick email to an expert to ask for a quick opinion on the paper, since they may well be the author themselves (as happened to me several times as a CRYPTO PC member), or someone closely related to them. Moreover, you often have the case where the reviewer knows who the authors are and has some history with them, even if it’s not a formal conflict, but the program committee member does not know this information. In particular, using anonymous submissions completely precludes a disclosure-based model for conflicts of interest (where reviewers disclose their relations with the authors in their reviews); instead you have to move to an exclusion-based model, where reviewers meeting some explicit criteria are ruled out.

If anonymous submissions don’t work well for theory conferences, does it mean we just have to accept biases? I don’t think so. I believe there are a number of things we could attempt. First, while completely anonymizing submissions might not work well, we could try to make the author names less prominent, for example by having them on the last page of the submission instead of the first, and not showing them in the conference software. Also, we could try “fairness through awareness”. As I mentioned in my tips for future FOCS/STOC chairs, one potential approach is to tag papers by authors who never had a prior STOC/FOCS paper (one could possibly also tag papers by authors from under-represented groups). One wouldn’t give such papers preferential treatment, but rather just make sure they get extra attention. For example, we could add an extra review for such papers. That review might end up being positive or negative, but it would counter the bias of dismissing some works out of hand.

To summarize, I agree with Michael’s and Suresh’s sentiments that biases are harmful and should be combated. I just don’t think anonymous submissions are the way to go about that.

Unique Games Conjecture – halfway there?

January 10, 2018

(Edit: scribe notes on my lectures on this topic are now up.)

Subhash Khot, Dor Minzer and Muli Safra just posted an exciting manuscript online. In it, they confirm the combinatorial hypothesis I’ve posted about before on the structure of non-expanding sets in the degree-two short code graph (or, equivalently, in the Grassmann graph). Together with their prior work with Dinur and Kindler, and a small assist from an expansion-to-testing reduction by Kothari, Steurer and me, this establishes (an imperfect completeness variant of) Khot’s two to one conjecture and the intermediate complexity conjecture. This also makes significant progress towards proving, or at least giving evidence for, the more famous unique games conjecture (UGC). In fact, as I explain below, there is a technical sense in which it goes “halfway” towards proving the UGC.

The Unique Games (UG(s,c)) problem with parameters 0 \leq s \leq c \leq 1 is the following: given a set of linear equations, each involving at most two variables over some finite field, distinguish between the completeness case, where there exists an assignment to the variables satisfying at least a c fraction of the equations, and the soundness case, where every assignment satisfies fewer than an s fraction.
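
To spell this out, here is a minimal formalization consistent with the description above (writing each constraint in the two-variable “unique” form x_{i_\ell} - x_{j_\ell} = b_\ell over a field \mathbb{F}_q is one standard convention). Given m such equations in variables x_1,\ldots,x_n \in \mathbb{F}_q, the two cases are:

Completeness: there exists x \in \mathbb{F}_q^n with |\{ \ell : x_{i_\ell} - x_{j_\ell} = b_\ell \}| \geq c \cdot m.

Soundness: for every x \in \mathbb{F}_q^n, |\{ \ell : x_{i_\ell} - x_{j_\ell} = b_\ell \}| < s \cdot m.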

Clearly the UG(s,c) problem  becomes easier the larger the gap between c and s. The unique games conjecture is that the problem is as hard as it can be, in the sense that it is NP hard for c arbitrarily close to one and s arbitrarily close to zero (as a function of the field size which we assume tends to infinity in what follows).  In other words, the difficulty of UG(s,c) as a function of c and s is conjectured by the UGC to look like this:

[Figure: the conjectured difficulty of UG(s,c) as a function of c and s under the UGC]

 

(Note that the problem does not make sense when c<s. Also, in this figure we ignore regions of measure zero. In particular for c = 1-o(1)  or s=o(1) the problem can be solved by a semidefinite program where the precise o(1) factors depend on the field size; it is known that if the UGC is true then this dependence is optimal.)
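
In symbols, one standard way to phrase the conjecture (equivalent to the informal statement above) is: for every \epsilon > 0 there exists a field size q = q(\epsilon) such that UG(\epsilon, 1-\epsilon) is NP hard.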

 

Until today, what was known about  unique games could be summarized in this (not to scale) figure:

[Figure (not to scale): the previously known NP hardness and subexponential-algorithm regimes for UG(s,c)]

That is, when s and c are sufficiently close to each other, UG(s,c) was known to be NP hard (see for example this paper), and in fact with a linear-blowup reduction establishing exponential hardness (i.e., 2^{\tilde{\Omega}(n)} under the exponential time hypothesis). On the other hand, when either the completeness parameter is sufficiently close to one or the soundness is sufficiently close to zero, there was a known subexponential time algorithm by Arora, Steurer and me (see also here) for UG(s,c). (That is, an algorithm running in time 2^{n^\xi} for some \xi<1 which tends to zero as either completeness tends to one or soundness tends to zero.)

However, that algorithm was of course only an upper bound, and we did not know whether it could be improved further to 2^{n^{o(1)}} or even polynomial time. Moreover, its mere existence showed that in some sense the techniques of the previously known NP hardness results for UG(s,c) (which used linear-blowup reductions) are inherently inapplicable to establishing the UGC, which requires hardness in a completely different regime.

The new result of Khot, Minzer and Safra (when combined with the prior ones) shows that UG(s,c) is NP hard for c arbitrarily close to half and s arbitrarily close to zero. (The result is presented as hardness of 2 to 2 games with completeness close to one, but immediately implies hardness of unique games with completeness close to half.) That is, the new picture of unique games’ complexity is as follows:

[Figure: the new picture of the complexity of UG(s,c) after Khot, Minzer and Safra]
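
In the notation above, the new hardness result can be phrased as: for every \epsilon > 0 there is a field size q such that UG(\epsilon, 1/2 - \epsilon) is NP hard.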

This establishes for the first time hardness of unique games in the regime for which a subexponential time algorithm was known, and hence (necessarily) uses a reduction with some (large) polynomial blowup. While it is theoretically still possible for the unique games conjecture to be false (as I personally believed would be the case until this latest sequence of results), the most likely scenario is now that the UGC is true, and the complexity of the UG(s,c) problem looks something like the following:

[Figure: the conjectured complexity of UG(s,c), with running time roughly 2^{n^{\xi(c,s)}} across the (c,s) plane]

 

That is, for every 0<s<c<1, the best running time is roughly 2^{n^{\xi(c,s)}} where \xi:(0,1)^2 \rightarrow (0,1] is a function that is always positive but tends to zero as c tends to one or s tends to zero (and achieves the value one in a positive-measure region of the plane). Of course we are still far from proving this, let alone characterizing this function, but this is very exciting progress nonetheless.

I personally am also deeply interested in the question of whether the algorithm that captures this curve is the sum of squares algorithm. Since SoS does capture the known subexponential algorithms for unique games, the new work provides more evidence that this is the case.


Intro TCS course post-mortem

January 3, 2018

This fall I taught CS 121 – “Introduction to Theoretical Computer Science” – at Harvard. This is analogous to courses known at other universities as “Introduction to the Theory of Computation”, “Automata and Computability”, or “Great Ideas in Theoretical Computer Science”, which are often taught using Sipser’s excellent book. However, I decided to significantly revise the course and write my own text (which is still a work in progress but is available online, with the source on a GitHub repository).

CS 121 is a required course for Computer Science concentrators, and so we had about 160 students. There was great variability in students’ preparation and background; many of them had not taken a proof-based course before. That, combined with the inevitable first-time kinks, made the first several weeks challenging for both the students and the teaching team. That said, I am overall very pleased with the students’ performance. In a course that contained fairly advanced material, students overall did quite well on the problem sets, midterm, and final exam. I was also very pleased with my team of teaching fellows (headed by an amazing undergraduate student – Juan Perdomo) who had to deal with teaching a new iteration of the course, including many concepts that they themselves weren’t so familiar with.

Perhaps the most significant change I made from the standard presentation is to make non uniform computation (specifically straight-line programs / circuits with NAND gates) the initial computational model, rather than automata. I am quite happy with this choice and intend to keep it for the following reasons (a small sketch of the model appears right after this list):

  • Boolean gates such as NAND have a much tighter correspondence to actual hardware, and convey the notion that this is not an arbitrary abstract model but one intended to capture computation as it is physically realizable.
  • Starting with a model for finite functions allows us to avoid dealing with infinity in the first few lectures.
  • Many of the conceptual lessons of the course – that we can model computation mathematically, that we can encode an algorithm as a string and give it as input to other algorithms, that there is a universal algorithm, and that some functions are harder than others – can already be explained in the finite non uniform setting.
  • The non uniform model is crucial for talking about cryptography (e.g., explaining notions such as “128 bits of security” or giving a model where bitcoin proofs of work make sense), pseudorandomness and quantum computing.  Cryptography and pseudorandomness are the most compelling examples of “mining hardness” or “making lemonade out of the computational difficulty lemon” which is a core take away concept. Further  I believe that it is crucial to talk about quantum computing in a course that aims to model computation as it exists in the world we live in.
  • A more minor point is that the non uniform model and the notion of “unrolling the loop” to simulate a uniform computation by a non uniform one make certain proofs, such as the Cook-Levin Theorem, Gödel’s Incompleteness Theorem, and BPP in P/poly, technically much easier.
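
To make the straight-line NAND model concrete, here is a minimal Python sketch (my own illustration, not the course’s NAND/NAND++ syntax): a four-line straight-line NAND program computing XOR, in which every line applies NAND to previously computed values.

```python
# A toy illustration of a straight-line NAND program: each line assigns a new
# variable the NAND of two earlier values (inputs or previously computed).
# This is a sketch for this post, not code from the CS 121 materials.

def NAND(a, b):
    """NAND of two bits (0/1)."""
    return 1 - (a & b)

def xor(x0, x1):
    """XOR of two bits, written as a 4-line straight-line NAND program."""
    u = NAND(x0, x1)
    v = NAND(x0, u)
    w = NAND(x1, u)
    return NAND(v, w)

if __name__ == "__main__":
    # Evaluate the program on all four inputs to check it computes XOR.
    for x0 in (0, 1):
        for x1 in (0, 1):
            print(x0, x1, "->", xor(x0, x1))
```

“Unrolling the loop” of a uniform program on inputs of a fixed length yields exactly such a straight-line program (one program per input length), which is what makes the simulations mentioned above technically simple.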

 

So how did the students like it? The overall satisfaction with the course was 3.6 (on a 1-5 scale), which gives me one more reason to be thankful that I’m a tenured professor and not an Uber driver. On the positive side, 60% of the students rated the course as “very good” or “excellent” and 83% rated it as “good”, “very good”, or “excellent”.

Here are the student answers to  the question “What did you learn from this course? How did it change you?”. As expected, they are a mixed bag. One answer was “I learned about the theory of computation. This course made me realize I do not want to study the theory of computation. ” which I guess means that the course helped this student fulfill the ancient goal of knowing thyself.

In terms of difficulty and workload, 44% of the students found it “difficult” and 23% found it “very difficult”, which is a little (but not significantly) more than the average difficulty level for CS classes at Harvard. While the mean number of hours (outside lectures) spent on this course per week was 11.6 (par for the course in CS classes), you don’t need the sum of squares algorithm to see that the distribution is a mixture model:

[Figure: histogram of weekly hours students spent on the course]

I imagine that the students with less math preparation needed to work much more (but perhaps also gained more in terms of their math skills).

 

Lessons learned for next time:

  • There is more work to be done on the text, especially to make it more accessible to students not used to reading mathematical definitions and notation. I plan to add more Sipser-style “proof ideas” to the theorems in the text, and to add more plain-English exposition, especially in the earlier chapters.
  • Many students got hung up on the details of the computational models, in particular my “Turing machine analog”, the NAND++ programming language. I need to find a way to strike the right balance between making sure there is a precise and well-defined model about which we can properly prove theorems, and getting the broader point across that the particular details of the model don’t matter.
  • I find the idea of incorporating programming-languages based models in this course appealing, and have made some use of Jupyter notebooks. I need to give more thought to how to use these tools in the most pedagogically useful way.

Women In Theory – registration deadline getting closer

December 24, 2017

The deadline to register for the Women In Theory workshop is January 16, 2018. As Omer Reingold posted, this is a wonderful workshop with a strong set of confirmed speakers, including Bonnie Berger, Yael Kalai, Julia Kempe, Gillat Kol, Nancy Lynch, and Barna Saha. It is sure to have great technical content, as well as be a wonderful experience, thanks to the organization of Tal Rabin, Shubhangi Saraf, and Lisa Zhang.

The workshop will take place on June 19-22, 2018 at Harvard University. I am a local co-organizer, together with Madhu Sudan and Salil Vadhan. Having WIT at Harvard brings back great memories for me, since I was involved in the first WIT at Princeton in 2008. In that workshop, Gillat Kol, Barna Saha, and Shubhangi Saraf (now speakers and an organizer) were participants.

I encourage any female graduate student in theoretical computer science to strongly consider applying for this workshop by filling out the form here.

ITCS early registration deadline

December 22, 2017

Message from Costis, Yael, and Vinod:

ITCS is back on the east coast, and will be at MIT from January 11-14, 2018. As you know, ITCS is a conference that is unique in many respects: it’s a conference that emphasizes dialog and discussion among all sub-areas of TCS, facilitating this with a single-track structure and “chair rants” providing the context for each session. Submissions, refereeing and presentations emphasize the “I” in ITCS: new concepts and models, new lines of inquiry, new techniques or novel use of existing techniques, and new connections between areas.

All in all, great fun! This year, ITCS will run for four full days with lots of activities. Tickets are going fast: the deadline for early registration and hotel block are both December 28, 2017.

A great tradition at ITCS is the “graduating bits” session, where graduating PhD students and postdocs give brief overviews of their research in advance of going out on the job market. If you fit the description, you should sign up here.

Following the success of the poster sessions at ITCS’17 and STOC’18, we will have one too, at the Marriott on the first evening of the conference. To sign up, go here.

We hope to see many of you at ITCS!
 
Your local organizers,
 
Costis, Yael & Vinod