- Our Science article, The reusable holdout: Preserving validity in adaptive data analysis (joint with Cynthia Dwork, Vitaly Feldman, Moritz Hardt, Toniann Pitassi and Aaron Roth), is out.

—-

Update (8/16/2015): On Privacy and Anonymisation (with Cynthia Dwork and Salil Vadhan) in The Economist

—-

For the last two years I have tried new ideas in teaching algorithms at the graduate level. The course is directed at first year CS grads, but is also taken by grads from related disciplines, and many advanced undergrads. (Links to course homepage, and single file with all course materials.)

The course may be interesting to you if, like me, you are rethinking the traditional choice of topics. The following were my thoughts behind the redesign:

- The environment for algorithms design and use has greatly changed since the 1980s. Problems tend to be less cleanly stated (as opposed to “bipartite matching” or “maximum flow”) and often involve high-dimensional and/or noisy inputs. Continuous optimization is increasingly important.
- As the last theory course my students (grad or undergrad) might take for the rest of their lives, it should somewhat fill in holes in their undergraduate CS education: information/coding theory, economic utility and game theory, decision-making under uncertainty, cryptography (anything beyond the RSA cryptosystem), etc.
- Programming assignments need to be brought back! CS students like hands-on learning: an algorithm becomes real only once they see it run on real data. Also, computer scientists today —whether in industry or academia—rely on subroutine libraries and scripting languages. A few lines of Matlab or SciPy can be written in minutes and run on datasets of millions or billions of numbers. No Java or C++ needed! Algorithms education should weave in such powerful tools. Even theory students benefit from playing with them.

Sample programming assignments: (a) (Compression via SVD) Given a 512 x 512 grayscale image, treat it as a matrix and take its rank-k approximation via SVD, for k = 15, 30, 45, 60. Use *mat2gray* in Matlab to render this new matrix as a grayscale image and see what k suffices for realistic recovery. (b) You are given S&P stock price data for 10 years. Run online gradient descent to manage a portfolio (Lecture 16), and report what returns you get with various parameter settings.

Students are allowed to do a final project in lieu of a final, and many choose to apply algorithms to some real world problem they are interested in. Sample projects are also listed on the course page.

I welcome your comments, suggestions, and links to other relevant course materials on the web!

—-

Indeed, throughout my discussions with Harvard I have been consistently impressed by the commitment at all levels, from the President down, to apply all the considerable material and intellectual resources of Harvard to make it one of the absolute top places for Theoretical Computer Science and Computer Science at large. This is not just great for Harvard, but also for our field, and I am truly excited to play a role in this endeavor.

—-

*“Overall, submissions related to multi-linear maps and indistinguishability obfuscation were held to a somewhat higher standard. The PC expressed some concern with the recent flurry of activities pertaining to multi-linear maps and indistinguishability obfuscation, given how little we understand and can say and *prove* about the underlying hardness assumptions”.*

This comment was clearly written with the best of intentions, to explain views expressed at the PC deliberations. And I’m thankful for it – mainly since it made the underlying misconceptions so explicit that it mandated a response. So, after discussing and commiserating with colleagues here at Simons, and after amusing ourselves with some analogues of the above statement (e.g., “results on NP-completeness are held to a higher standard given how little we understand and can say and ***prove*** about the hardness of solving SAT in polynomial time”), I decided to try to write an – obviously subjective – account of the recent developments in multilinear maps and indistinguishability obfuscation (IO), and why this exciting research should be embraced and highlighted rather than “held to a somewhat higher standard” — in spite of how little we understand about the underlying assumptions. The account is aimed at the general CS theorist.

Let me start by giving rough definitions of the concepts involved. An Indistinguishability Obfuscator (IO) is a randomized algorithm O that takes as input a circuit C and outputs a (distribution over) circuits O(C) with the properties that:

- C and O(C) have the same functionality,
- O(C) is only polynomially larger than C, and
- for any two same-size, functionally equivalent circuits C and C’ we have that O(C) ~ O(C’) (i.e., the distributions over strings representing O(C) and O(C’) are computationally indistinguishable).

IO was proposed as a notion of obfuscation in 2000 (Hada, Barak-Goldreich-Impagliazzo-Sahai-Vadhan-Yang). Indeed, it is arguably a clean and appealing notion – in some sense the natural extension of semantic security of standard encryption to “functionality-preserving encryption of programs”. However, it was largely viewed as too weak to be of real applicability or interest. (There were also no candidate polytime IO schemes, but this in my eyes is a secondary point; see below.)

Things changed dramatically in 2013, when Sahai and Waters demonstrated how IO schemes can be ingeniously combined with other rather “mundane” cryptographic constructs to do some amazing things. Since then, dozens of papers have appeared that extend the SW techniques and apply them to obtain even more amazing things – results that by now have transcended crypto and spilled over to other areas (e.g., deniable encryption, succinct delegation, succinct multi-party computation with hardly any interaction, one-message succinct witness-hiding and witness-indistinguishable proofs, hash functions with random-oracle-like properties, hardness results for PPAD, and many more). In fact, think of a result in your area that assumes that some computation is done inside a black box – most probably IO can replace that assumption in one way or another…

Still, my (subjective but distinct) feeling is that we are far from understanding the limits and full power of IO. Furthermore, the study of IO has brought with it a whole new toolbox of techniques that are intriguing in their own right, and teach us about the power and limitations of working with “encrypted computations”.

So far I have not mentioned any candidate constructions of IO – and indeed the above study is arguably valuable as a pure study of this amazing concept, even without any candidate constructions. (Paraphrasing Levin on quantum computers, one can take the viewpoint that the above is the study of impossibility results for IO…)

However, unlike quantum computers, here we also have candidate constructions. This is where multilinear maps come to play.

Multi-linear maps are this cool new technical tool (or set of tools) that was recently put forth. (The general concept was proposed by Boneh and Silverberg around 2000, and the first candidate construction of one of the current variants was presented in 2012 by Garg, Gentry and Halevi.) Essentially, a multilinear map scheme is a fully homomorphic encryption scheme where the public key provides, in addition to the ability to encrypt elements and perform homomorphic operations on ciphertexts, also the ability to partially decrypt ciphertexts under certain restrictions. There are many incomparable variants of this general paradigm, which differ both in the functionality provided and in the security guarantees. Indeed, variants appear to be closely tied to candidate constructions. Furthermore, our understanding of what’s possible here has been evolving considerably, with multiple new constructions, attacks, and fixes reported.
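To make the interface concrete, here is a toy sketch in Python of the “graded encoding” functionality described above. The caveat is loud: values are stored in the clear, so this has zero security; it illustrates only the functionality. The names and parameter choices are my own, purely for illustration:

```python
# Toy (completely insecure) model of a symmetric graded encoding scheme:
# an "encoding" is a pair (value mod P, level). One may add encodings at the
# same level, multiply encodings as long as their levels sum to at most KAPPA,
# and zero-test only at the top level KAPPA. Real candidates (e.g., the
# Garg-Gentry-Halevi scheme) hide the value inside the encoding; here it is
# in the clear.

P = 2**61 - 1   # illustrative prime modulus
KAPPA = 3       # multilinearity degree

def encode(x, level=1):
    return (x % P, level)

def neg(a):
    x, lvl = a
    return ((-x) % P, lvl)

def add(a, b):
    (xa, la), (xb, lb) = a, b
    assert la == lb, "can only add encodings at the same level"
    return ((xa + xb) % P, la)

def mult(a, b):
    (xa, la), (xb, lb) = a, b
    assert la + lb <= KAPPA, "product would exceed the multilinearity degree"
    return ((xa * xb) % P, la + lb)

def is_zero(a):
    x, lvl = a
    assert lvl == KAPPA, "zero test is only available at the top level"
    return x == 0

# Example: check whether two degree-3 products agree, using only the interface.
lhs = mult(mult(encode(3), encode(5)), encode(7))
rhs = mult(mult(encode(7), encode(5)), encode(3))
print(is_zero(add(lhs, neg(rhs))))  # True: 3*5*7 == 7*5*3
```

The “partial decryption under certain restrictions” of the prose is exactly the restricted `is_zero` test: it leaks one bit about a top-level encoding and nothing below that level.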

Still, the number and variety of applications of multi-linear maps make it clear that this “family of primitives” is extremely powerful and well worth studying – at the level of candidate constructions, at the level of finding the “right” computational abstractions, and at the level of applications. In a sense, we are here back in the 70’s: we are faced with a new set of algebraic and number-theoretic tools, and are struggling to find good ways to use them and abstract them.

Indeed, some of the most powerful applications of multilinear maps are candidate constructions of IO schemes. The first such candidate construction (by Garg, Gentry, Halevi, Raykova, Sahai and Waters in 2013) came with only heuristic arguments for security; however, more rigorous analyses of this and other constructions, based on well-defined formulations of multi-linear map variants, soon followed suit. Some of these analyses have eventually been “broken,” in the sense that we currently don’t have candidate constructions that satisfy the properties they assume. Still, other analyses do remain valid. Indeed, there are no known attacks against the actual basic IO scheme of Garg et al.

The fact that the only current candidate constructions of IO need to assume the existence of some variant of multi-linear maps at some point or another may make it seem as if the two concepts are somehow tied together. However, there is no reason to believe that this is the case. For all we know, multi-linear maps are just the first path uncovered to IO, and other paths may well be found. Similarly, even if IO turns out to be unobtainable for some reason, the study of multilinear maps and their power will remain very relevant.

So, to sum up this long-winded account:

- IO is a natural and fascinating computational concept. Studying its consequences (both within and outside cryptography) is a worthwhile endeavor.
- Studying new candidate constructions of IO and/or new analyses of their security is another worthwhile endeavor.
- Multilinear maps are an intriguing and powerful set of techniques and tools. Finding better candidate constructions and abstractions is of central importance to cryptography. Finding new cool uses of these maps is another intriguing challenge.
- The three should be treated as separate (although touching and potentially interleaving) research efforts.

———–

I’d like to thank Guy Rothblum and Vinod Vaikuntanathan for great comments that significantly improved this post.

—-

__Popular TOC books__:

Scott Aaronson, *Quantum Computing since Democritus*

Martin Davis, *Engines of Logic: Mathematicians and the Origin of the Computer*

A. K. Dewdney, *The New Turing Omnibus: Sixty-Six Excursions in Computer Science*

David Harel, *Computers Ltd.: What They Really Can’t Do*

David Harel with Yishai Feldman, *Algorithmics: The Spirit of Computing*

Douglas Hofstadter, *Gödel, Escher, Bach: An Eternal Golden Braid*

Lance Fortnow, *The Golden Ticket: P, NP, and the Search for the Impossible*

Cristopher Moore and Stephan Mertens, *The Nature of Computation*

Dennis Shasha and Cathy Lazere, *Out of their Minds: The Lives and Discoveries of 15 Great Computer Scientists*

Leslie Valiant, *Probably Approximately Correct: Nature’s Algorithms for Learning and Prospering in a Complex World*

Leslie Valiant, *Circuits of the Mind*

Noson S. Yanofsky, *The Outer Limits of Reason: What Science, Mathematics, and Logic Cannot Tell Us*

Hector Zenil, *Randomness Through Computation: Some Answers, More Questions*

__Fiction__

Apostolos Doxiadis and Christos Papadimitriou, *Logicomix: An epic search for truth*

Christos H. Papadimitriou, *Turing (A Novel about Computation)*

__Other Resources__:

CS Unplugged (including a book)

—-

This post is meant to record some of my observations.

**How I felt**: The first two weeks felt a bit stressful and burdensome. But once I got used to it, I started enjoying the lectures, and it was quite pleasing to hear (and in some cases see) that a good fraction of the students liked the material, and to see them participating in class.

**Hindsight**: The most significant point was the level of the assignments. Here I erred mainly due to a mismatch in expectations. On the first assignment the median was 100%, so I increased the level. The next one came in at 77%, which still felt high and not challenging enough for the students. At this point I consciously had 50% of each assignment be moderately easy problems (directly based on class work) and the remaining 50% range from not-so-easy to problems requiring at least one new idea. While perhaps the concept was right, the proportions were off from what the students expected. An 80-20 or so split would have been much better in hindsight. I got it almost right for the final, with the median being 75%.

There were no real surprises in the syllabus covered, with most topics being in common with other similar classes (you can compare here: Harvard, MIT 1, MIT 2, MIT 3, CMU 1, CMU 2, Stanford 1, Stanford 2, Coursera-Stanford). However, it did feel a little ambitious in the end, and the content needs some pruning. For instance, I spent one lecture each on three somewhat non-standard topics – analyzing sampling methods, contention resolution, and cuckoo hashing. Next time, covering just one of them, or even none, might be better.

A few people asked to include a programming component in the course. This makes perfect sense and I indeed considered it seriously at the beginning and thought about doing something like what Jelani Nelson used at Harvard. But it was plainly infeasible to have programming components in the assignments with the available resources (Jelani tells me he had 10 TAs for a class of about 180). Perhaps for the next time around I can suggest problems for students to play with even if they won’t be graded.

One other request was for practice midterm/final questions. I am still undecided about this one.

**Proofs**: I spent a lot of time in class proving that various (in some cases extremely simple) algorithms work. This is not an exception for this course, but seems to be true of most similar courses (check the syllabi: Harvard, MIT 1, MIT 2, MIT 3, CMU 1, CMU 2, Stanford 1, Stanford 2, Coursera-Stanford).

So, as a few students asked, why so much emphasis on proofs in an algorithms class? There are two separate issues here. First, perhaps my not-so-clear presentation (this is the first run after all). Let us separate that from the second, probably more pressing one – if the goal of an algorithms course is to develop algorithmic thinking and/or prepare the students mainly for a career in software engineering, why should we (by we I mean all algorithms courses across the universities) emphasize proofs?

First, which proofs did I spend a lot of time doing? Well, there was 1) BFS/DFS, 2) FFT, 3) Minimum spanning trees, 4) Sampling, 5) Quicksort, 6) Hashing.

BFS/DFS we can explain as they serve as examples to illustrate induction, invariants etc. For FFT, the algorithm and the proof are one and the same – you can’t quite come up with the algorithm without the proof. But how about the others?

Take MST: with the right questions, you can motivate students to come up with the algorithm themselves, as it is indeed quite natural and simple. But shouldn’t that be the end of developing algorithmic thinking? The same goes for Quicksort and hashing: randomized divide & conquer makes intuitive sense, and so does making random choices when in doubt. Why go deeply into probability and linearity of expectation to analyze these? Here are two worthwhile reasons (among many) I can think of.

First, speed is not everything – we need to be sure that the algorithm works. At the end of the day, even when you just want to build something hands-on, in many cases you need to be absolutely sure that what you have actually works. For example, it is easy to come up with instances where a greedy strategy fails; in class I did do such an example (knapsack). Looking back, though, I should have emphasized it more and drawn parallels with other examples where greedy fails.

Second, the goal of the course is not just to help with programming faster code but also to serve as a launching pad for a path to computer science (with emphasis on the ‘science’). Even in the former case, thinking about algorithms in a principled way and being able to analyze them will eventually help in designing new algorithms; especially so when you have to tweak existing algorithms for new applications. Looking back, including more examples to demonstrate this concept would have been quite helpful.

—-

This week, the lecture series on historical papers in cryptography continues, now complete with its own webpage and video links. From Vinod: “we will hear about the love affair between quantum computing and cryptography through the words of the inimitable Umesh Vazirani. Everyone’s invited”.

If you’re in the greater Berkeley area, please do drop by. Details below.

**Quantum and Post-Quantum Cryptography**

Speaker: Umesh Vazirani (UC Berkeley)

Date: Monday June 22, 2-3:30pm

Location: Calvin Lab Auditorium

This talk will trace the fundamental impact of quantum computation on cryptography, including the breaking of classical cryptosystems such as RSA by quantum algorithms and, remarkably, the use of quantum algorithms to design and establish security of other classical cryptosystems. I will also describe how novel features of quantum states have been exploited to create quantum cryptographic primitives, and the challenges in defining and establishing security of such primitives. The talk is aimed at a general audience and will not assume any background in quantum computation.


—————

Throughout our conference design process we often observe big shifts in people’s opinions as they engage with the issues and the mathematical constraints. So if you have strong opinions about the theory festival, I highly recommend spending half an hour trying to come up with your own design.

Before and during your design, answer the following questions to yourself about the event you are planning:

- How is the event appealing to theorists who currently don’t come?
- How is the event creating more interaction opportunities?
- Part of the target audience wants **more** signal from the PC (= more power), and part of the target audience wants to give **less power** to the PC because they disagree with its past decisions and general preferences. Which direction does your plan go in?

Keep in mind also the general equilibrium view:

(i) People worry about the effect of any change in the conferences on hiring/promotion/grant applications. The general equilibrium view says that if you double or halve the total number of STOC papers (“the money supply”) its only effect will be to double/halve the number of publications required to get the job or the grant. So what should determine the total number of accepts in your design?

(ii) The net attention of attendees is unchanged. X-minute talks in 4 parallel sessions use up the same amount as X/2-minute talks in 2 parallel sessions. Which do you prefer —as author and as attendee—and why?

Of course you could argue that you can change the equilibrium by causing more jobs/grants to be created, or by increasing the number of attendees. In that case, please state your assumptions.

Have a go at it, and if you come up with interesting designs, please sketch them in your comments!

———————————————————————————

Data:

Currently: 90 accepts; 20 min talks in 2 parallel sessions (about 16-17 hrs)

Essentially no plenary. 1 separate day of workshops.

Our designs assume at least 12 plenary hrs, 2 hrs of tutorial, 1 day of workshops (all distributed over 5 days). Plus, two hours for lunch and an evening poster session.

Remember to allow for changeover time between speakers.
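As a quick back-of-the-envelope check of the 16-17 hr figure above (the 2-minute changeover here is my own assumption, not part of the data):

```python
accepts, talk_min, parallel = 90, 20, 2
changeover_min = 2                      # assumed buffer between speakers

slots = accepts // parallel             # 45 time slots per parallel track
hours = slots * (talk_min + changeover_min) / 60
print(hours)                            # 16.5
```

With no changeover at all the same 45 slots take 15.0 hours, which brackets the quoted range.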


————–

An important part of the plan for theory festival —which everybody involved agrees upon—is the need for a substantial plenary component. The festival organizing committee would select the plenary program based upon inputs from various sources.

Plenary sessions will include about 20-25 short talks from a broad spectrum of “Theory” subcommunities, including (but not limited to) SODA, CCC, COLT, CRYPTO, KDD, EC, PODS, PODC, etc., as well as STOC and FOCS. We envisage some kind of nomination process whereby these communities/PCs could propose papers presented at their recent conferences that would be of interest to a broader theory audience. Sometimes they could nominate an older paper that is now generating a lot of work, or a survey talk.

Plenary sessions would also include 1-hr lectures introducing an area in science, social science, or mathematics of interest to a broad theory audience. I could’ve generated some sample topics myself, but in the interest of fun I decided to ask for suggestions from a small group of people. (I’ve reworded/shortened their answers.)

Silvio Micali: *Connectomics* (figuring out the graph of interconnections of the brain’s neurons from imaging data).

Scott Aaronson: (a) Recent work connecting complexity, quantum information and quantum gravity (Harlow, Hayden, Preskill etc.); it is creating waves (b) Theorist-friendly introduction to deep nets and deep learning.

Ankur Moitra: *Linear Inverse Problems*: recovering an object from linear measurements (includes robust PCA, matrix completion, phase retrieval, tensor completion, etc.). May have interesting connections to SDPs and other convex methods studied in our community.

Suresh Venkatasubramanian: (a) Computational topology. Motivated by data analysis, it has completely taken over what used to be called computational geometry. STOC/FOCS people might be able to provide approximation algorithms for topological quantities. (b) Optimization: a basic primitive in many applied settings, especially machine learning. Esoteric optimization ideas like momentum and regularizers are now ubiquitous in applications, but haven’t affected the STOC/FOCS world much (except for recent work on flows).

In your comments, please send other suggestions for talks that might be interesting.

Remember, the festival will also have a separate slot for *technical* tutorials on interesting topics within CS and theoretical CS. Also, some workshops may feature their own invited/plenary talks.

—-

*STOC Festival Design*: **Improving interaction and fun factor; reducing information overload.**

**Sanjeev Arora**

*How can we increase the value added by a conference in today’s information-rich world, when papers have been available on arxiv for months to the experts in that area?*

These are some personal thoughts (ie I am not representing the committee or SIGACT).

First, I wish to make a plug for poster sessions at STOC: all papers should also be presented at an evening poster session. If you missed a talk, you can get the 2-5 min (or longer!) version at the poster session, tailored to your level of prior knowledge and speed of comprehension. (Remember, theory says that 2-way communication is exponentially more efficient than one-way!) Poster presenters – often students and junior researchers – will get a chance to meet others, especially senior researchers. Ideas and email addresses will get exchanged. (Currently I talk to approximately zero students at the conference – certainly, nothing facilitates it.) Also, different coauthors could present the talk and the poster, which **doubles** the number of people presenting at the conference.

Second, conferences should do a better job of helping us navigate today’s sea of information. (As Omer notes in his post, we can decouple the “journal of record” role of STOC from the actual program of talks.) The current format of 95+ talks of 20 min is very fatiguing, and it is hard to figure out “What to do if my attention span only allows N talks?” Arguably, this question can be answered by the PC, but that signal is deliberately **discarded** and hidden from the attendees. One way to reveal this signal would be to schedule talks of different lengths. For example, with 130 accepts one could have 8 talks of 20 minutes in plenary sessions, 48 talks of 20 minutes in two parallel sessions, and 74 talks of 5 minutes each in two parallel sessions. (And all papers would also be presented in poster sessions.)
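As a sanity check on this example split, the podium time it needs (ignoring changeovers) actually comes out below what the current 90-accept format uses:

```python
# Podium minutes for the hypothetical 130-accept split above
plenary   = 8 * 20           # 8 plenary talks, 20 min each
long_par  = (48 // 2) * 20   # 48 twenty-minute talks across 2 parallel tracks
short_par = (74 // 2) * 5    # 74 five-minute talks across 2 parallel tracks
proposed  = plenary + long_par + short_par

current = (90 // 2) * 20     # 90 twenty-minute talks across 2 parallel tracks
print(proposed / 60, current / 60)  # 13.75 15.0
```

So the longer program of accepts fits comfortably in the same (in fact a smaller) number of session hours.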

Benefits: (a) Allows a **substantial** increase in the number of accepts, to 130, while staying with **two parallel sessions**. (b) May lead to a less risk-averse PC (i.e., a more diverse conference) while maintaining a very high-quality core. (c) Attendees get to tailor their consumption of content. (d) A 5-minute talk is still enough for the presenter to give a sense of the work and publicize it. Each attendee gets exposed to ½ of the overall program instead of ⅓; this is **efficient use** of their attention span.

Possible objections: (a) Effect on tenure/promotion. (b) Noisiness of the signal (c) Authors are worse off.

I think (a) will become a non-issue: if today’s tenure case has X STOC papers, tomorrow’s might have X/2 papers with 20-min talks and X with 5-min talks. (b) Yes, PCs are fallible, but weigh that against all the benefits above. If we don’t believe in PC judgement, we might as well disband STOC.

For (c), let’s do a quick Pareto analysis. The comparison plan on the table is 95 accepts: 8 plenary talks + 87 talks of 20 min in **three** parallel sessions. (We need three sessions because of the substantial plenary component being added.)

With 130 accepts the turnout will be higher; perhaps 25% higher. Authors are trying to maximize the number of people exposed to their paper. The basic math is that ½ of 125% is roughly **twice** ⅓ of 100%. We’ll see that all authors are much better off in this proposal, except those whose paper had a nominal “rank” of 57-95 in the PC process, who both gain and lose.

Rank 1-8: Somewhat **better** off (125% vs 100%)

Rank 9-56: Significantly **better** off (62% vs 33%).

Rank 57-95: **Gain and lose.** (An audience of 62% instead of 33%, but a 5-min talk instead of a 20-min one.)

Rank 96-130: Significantly **better** off. Their paper gets into proceedings, and they get 5 min to pique the interest of 62% of the audience (without waiting half a year to resubmit).

Every ex-PC member will tell you that papers that end up in the third category were equally likely to be in the fourth, and vice versa. Knowing this, rational authors should prefer this new plan. It makes smarter use of a scarce resource: attendees’ attention span.
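The audience-share percentages quoted in the ranks above follow from two stated assumptions: turnout rises by about 25%, and an attendee in k parallel sessions sees 1/k of the talks:

```python
current_turnout, new_turnout = 1.00, 1.25   # the 25% higher turnout is an assumption

share_3par = current_turnout / 3   # 95-accept plan: three parallel sessions
share_2par = new_turnout / 2       # 130-accept plan: two parallel sessions

print(f"{share_2par:.0%} vs {share_3par:.0%}")  # 62% vs 33%
```

The ratio is 1.875, which is the “roughly **twice**” in the basic math above.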
