It is hard to overestimate the impact of popular science books such as “A Brief History of Time” and “Chaos: Making a New Science” on scientific research. The indirect impact of popularizing science and science education often surpasses the direct contribution that most scientists can hope to achieve in their lifetime. For this reason, many of the greatest scientists (including in our field) choose to invest considerable time in this blessed endeavor. I personally believe that the Theory of Computing deserves more popularization than it gets (and I hope to someday contribute my share). Nevertheless, this post is meant as a tribute to our colleagues who have already made wonderful contributions of this kind. I will continuously edit this post with TOC popular books and educational resources (based on my own knowledge and suggestions in the comments).
Popular TOC books:
Scott Aaronson, Quantum Computing since Democritus
Martin Davis, Engines of Logic: Mathematicians and the Origin of the Computer
A. K. Dewdney, The New Turing Omnibus: Sixty-Six Excursions in Computer Science
David Harel, Computers Ltd.: What They Really Can’t Do
David Harel with Yishai Feldman, Algorithmics: The Spirit of Computing
Douglas Hofstadter, Gödel, Escher, Bach: An Eternal Golden Braid
Lance Fortnow, The Golden Ticket: P, NP, and the Search for the Impossible
Cristopher Moore and Stephan Mertens, The Nature of Computation
Dennis Shasha and Cathy Lazere, Out of their Minds: The Lives and Discoveries of 15 Great Computer Scientists
Leslie Valiant, Probably Approximately Correct: Nature’s Algorithms for Learning and Prospering in a Complex World
Leslie Valiant, Circuits of the Mind
Noson S. Yanofsky, The Outer Limits of Reason: What Science, Mathematics, and Logic Cannot Tell Us
Hector Zenil, Randomness Through Computation: Some Answers, More Questions
Apostolos Doxiadis and Christos Papadimitriou, Logicomix: An epic search for truth
Christos H. Papadimitriou, Turing (A Novel about Computation)
CS Unplugged (including a book)
I taught my first class last quarter and it was an enjoyable and eye-opening experience on many levels. First, some background. The class was undergraduate algorithms, popularly known at UCLA as CS180. There were 129 students (kind of like jumping into the deep end to test the waters). As in most other CS curricula, it is a core required course, and as I later heard from the students, the class can have a significant impact on where you intern or even where you eventually get hired (all software companies want to know how you did in this course).
This post is meant to record some of my observations.
How I felt: The first two weeks felt a bit stressful and burdensome. But once I got used to it, I started enjoying the lectures, and it was indeed quite pleasing to hear (and in some cases see) that a good fraction of the students liked the material, and to see them participating in class.
Hindsight: The most significant point was the level of the assignments. Here I erred, mainly due to a mismatch in expectations. On the first assignment the median was 100%, so I increased the difficulty. The median on the next one was 77%, which still felt high and not challenging enough for the students. At this point I consciously made 50% of each assignment moderately easy problems (directly based on class work), with the remaining 50% ranging from not-so-easy to problems requiring at least one new idea. While the concept was perhaps right, the proportions were off from what the students expected. An 80-20 or so split would have been much better in hindsight. I got it almost right on the final, where the median was 75%.
There were no real surprises in the syllabus, with most topics being in common with other similar classes (you can compare here: Harvard, MIT 1, MIT 2, MIT 3, CMU 1, CMU 2, Stanford 1, Stanford 2, Coursera-Stanford). However, it did feel a little ambitious in the end, and the content needs some pruning. For instance, I spent one lecture each on three somewhat non-standard topics – analyzing sampling methods, contention resolution, and cuckoo hashing. Next time, covering only one of them, or even none, would perhaps be better.
A few people asked to include a programming component in the course. This makes perfect sense and I indeed considered it seriously at the beginning and thought about doing something like what Jelani Nelson used at Harvard. But it was plainly infeasible to have programming components in the assignments with the available resources (Jelani tells me he had 10 TAs for a class of about 180). Perhaps for the next time around I can suggest problems for students to play with even if they won’t be graded.
One other request was for practice midterm/final questions. I am still undecided about this one.
Proofs: I spent a lot of time in class proving that various (in some cases extremely simple) algorithms work. This is not an exception for this course, but seems to be true for most similar courses (check the syllabi: Harvard, MIT 1, MIT2, MIT 3, CMU 1, CMU 2, Stanford 1, Stanford 2, Coursera-Stanford).
So, as a few students asked, why so much emphasis on proofs in an algorithms class? There are two separate issues here. The first is perhaps my not-so-clear presentation (this is the first run, after all). Let us separate that from the second, probably more pressing one – if the goal of an algorithms course is to develop algorithmic thinking and/or prepare the students mainly for a career in software engineering, why should we (and by we I mean all algorithms courses across universities) emphasize proofs?
First, which proofs did I spend a lot of time doing? Well, there were 1) BFS/DFS, 2) FFT, 3) minimum spanning trees, 4) sampling, 5) Quicksort, and 6) hashing.
BFS/DFS we can explain, as they serve as examples to illustrate induction, invariants, etc. For FFT, the algorithm and the proof are one and the same – you can’t quite come up with the algorithm without the proof. But how about the others?
Take MST, Quicksort, and hashing. With the right questions, you can motivate students to come up with these algorithms themselves, as they are indeed quite natural and simple. But shouldn’t that be the end of developing algorithmic thinking? Randomized divide and conquer makes intuitive sense, and so does making random choices when in doubt. Why go deeply into probability and linearity of expectation to analyze these? Here are two worthwhile reasons (among many) I can think of.
First, speed is not everything – we need to be sure that the algorithm works. At the end of the day, even when you just want to build something hands-on, in many cases you need to be absolutely sure that what you have actually works. For example, it is easy to come up with examples where greedy fails; in class I did one (knapsack). Looking back, I should have emphasized it more and drawn parallels with other examples where greedy fails.
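For concreteness, here is a minimal sketch (my own illustrative instance, not one from the class) where the natural value-density greedy fails on 0/1 knapsack:

```python
# Value-density greedy for 0/1 knapsack: take items in decreasing
# value/weight order whenever they still fit. (Hypothetical instance.)
def greedy_knapsack(items, capacity):
    total = 0
    for weight, value in sorted(items, key=lambda it: it[1] / it[0], reverse=True):
        if weight <= capacity:
            capacity -= weight
            total += value
    return total

items = [(6, 60), (5, 45), (5, 45)]         # (weight, value) pairs
print(greedy_knapsack(items, capacity=10))  # → 60
# Greedy grabs the density-10 item first, which blocks both (5, 45) items;
# the optimal solution takes the two (5, 45) items for a value of 90.
```

A two-line counterexample like this is exactly the kind of certainty the proof-centric approach is after.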
The cryptography semester at the Simons Institute is well on its way. Last week we had a fascinating workshop on securing computation: thanks to Hugo Krawczyk and Amit Sahai for organizing. You can find the program and video links here (covering, among many other topics, everything you always wanted to know about obfuscation but were afraid to ask). Beyond the tremendous energy and excitement about cryptography research, participants have also been keeping busy with regular movie nights, swing dancing lessons, playback theater, volleyball and hiking adventures.
This week, the lecture series on historical papers in cryptography continues, now complete with its own webpage and video links. From Vinod: “we will hear about the love affair between quantum computing and cryptography through the words of the inimitable Umesh Vazirani. Everyone’s invited”.
If you’re in the greater Berkeley area, please do drop by. Details below.
Quantum and Post-Quantum Cryptography
Speaker: Umesh Vazirani (UC Berkeley)
Date: Monday June 22, 2-3:30pm
Location: Calvin Lab Auditorium
This talk will trace the fundamental impact of quantum computation on cryptography, including the breaking of classical cryptosystems such as RSA by quantum algorithms and, remarkably, the use of quantum algorithms to design and establish the security of other classical cryptosystems. I will also describe how novel features of quantum states have been exploited to create quantum cryptographic primitives, and the challenges in defining and establishing the security of such primitives. The talk is aimed at a general audience and will not assume any background in quantum computation.
Sanjeev suggests an interesting exercise, in our series on the design of a Theory Festival as part of STOC 2017:
Throughout our conference design process we often observe big shifts in people’s opinions as they engage with the issues and the mathematical constraints. So if you have strong opinions about the theory festival, I highly recommend spending half an hour trying to come up with your own design.
Before and during your design, answer the following questions to yourself about the event you are planning:
- How does the event appeal to theorists who currently don’t come?
- How does the event create more interaction opportunities?
- Part of the target audience wants more signal from the PC (= more power), and part of the target audience wants to give less power to the PC because they disagree with its past decisions and general preferences. Which direction does your plan go in?
Keep in mind also the general equilibrium view:
(i) People worry about the effect of any change in the conferences on hiring/promotion/grant applications. The general equilibrium view says that if you double or halve the total number of STOC papers (“the money supply”) its only effect will be to double/halve the number of publications required to get the job or the grant. So what should determine the total number of accepts in your design?
(ii) The net attention of attendees is unchanged. X-minute talks in 4 parallel sessions use up the same amount of it as X/2-minute talks in 2 parallel sessions. Which do you prefer—as author and as attendee—and why?
Of course you could argue that you can change the equilibrium by causing more jobs/grants to be created, or by increasing the number of attendees. In that case, please state your assumptions.
Have a go at it, and if you come up with interesting designs, please sketch them in your comments!
Currently: 90 accepts; 20 min talks in 2 parallel sessions (about 16-17 hrs)
Essentially no plenary. 1 separate day of workshops.
Our designs assume at least 12 plenary hrs, 2 hrs of tutorial, 1 day of workshops (all distributed over 5 days). Plus, two hours for lunch and an evening poster session.
Remember to allow for changeover time between speakers.
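The numbers above are easy to sanity-check. Below is a back-of-the-envelope sketch (my own, not from the post); the 2-minute changeover figure is an assumption, chosen so the current format lands in the quoted 16-17 hr range.

```python
# Back-of-the-envelope calculator for total contributed-talk hours.
# changeover_min (time lost between speakers) is a hypothetical assumption.
def session_hours(num_talks, talk_min, parallel, changeover_min=2):
    slots = num_talks / parallel              # talks per parallel track
    return slots * (talk_min + changeover_min) / 60

# Current STOC: 90 accepts, 20-min talks, 2 parallel sessions
print(session_hours(90, 20, 2))  # 16.5 hours, within the quoted 16-17 hrs
```

Plugging your own design into this kind of calculation quickly shows how the plenary hours, tutorials, and workshops compete for the 5-day budget.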
Towards the business meeting, another personal post in our series (this one by Sanjeev Arora):
An important part of the plan for theory festival —which everybody involved agrees upon—is the need for a substantial plenary component. The festival organizing committee would select the plenary program based upon inputs from various sources.
Plenary sessions will include about 20-25 short talks from a broad spectrum of “Theory” subcommunities, including (but not limited to) SODA, CCC, COLT, CRYPTO, KDD, EC, PODS, PODC, etc., as well as STOC and FOCS. We envisage some kind of nomination process whereby these communities/PCs could propose papers presented at their recent conferences that would be of interest to a broader theory audience. Sometimes they could nominate an older paper that is now generating a lot of work, or a survey talk.
Plenary sessions would also include 1-hr lectures introducing an area in science, social science, or mathematics of interest to a broad theory audience. I could have generated some sample topics myself, but in the interest of fun I decided to ask a small group of people for suggestions. (I’ve reworded/shortened their answers.)
Silvio Micali: Connectomics (figuring out the graph of interconnections of the brain’s neurons from imaging data).
Scott Aaronson: (a) Recent work connecting complexity, quantum information and quantum gravity (Harlow, Hayden, Preskill etc.); it is creating waves (b) Theorist-friendly introduction to deep nets and deep learning.
Ankur Moitra: Linear inverse problems: recovering an object from linear measurements (includes robust PCA, matrix completion, phase retrieval, tensor completion, etc.). May have interesting connections to SDPs and other convex methods studied in our community.
Suresh Venkatasubramanian: (a) Computational topology. Motivated by data analysis, it has completely taken over what used to be called computational geometry. STOC/FOCS people might be able to provide approximation algorithms for topological quantities. (b) Optimization: a basic primitive in many applied settings, especially machine learning. Esoteric optimization ideas like momentum and regularizers are now ubiquitous in applications, but haven’t affected the STOC/FOCS world much (except for recent work on flows).
In your comments, please send other suggestions for talks that might be interesting.
Remember, the festival will also have a separate slot for technical tutorials on interesting topics within CS and theoretical CS. Also, some workshops may feature their own invited/plenary talks.
STOC Festival Design: Improving interaction and fun factor; reducing information overload – guest post by Sanjeev Arora
[Yet another personal post in our series]
How can we increase the value added by a conference in today’s information-rich world, when papers have been available on arxiv for months to the experts in that area?
These are some personal thoughts (i.e., I am not representing the committee or SIGACT).
First, I wish to make a plug for poster sessions at STOC: all papers should also be presented at an evening poster session. If you missed a talk, you can get the 2-5 min (or longer!) version at the poster session, tailored to your level of prior knowledge and speed of comprehension. (Remember, theory says that 2-way communication is exponentially more efficient than 1-way!) Poster presenters—often students and junior researchers—will get a chance to meet others, especially senior researchers. Ideas and email addresses will get exchanged. (Currently I talk to approximately zero students at the conference—certainly, nothing facilitates it.) Also, different coauthors could present the talk and the poster, doubling the number of people presenting at the conference.
Second, conferences should do a better job of helping us navigate today’s sea of information. (As Omer notes in his post, we can decouple the “journal of record” role of STOC from the actual program of talks.) The current format of 95+ talks of 20 min each is very fatiguing, and it is hard to figure out “What to do if my attention span only allows N talks?” Arguably, this question can be answered by the PC, but that signal is deliberately discarded and hidden from the attendees. One way to reveal this signal would be to schedule talks of different lengths. For example, with 130 accepts one could have 8 talks of 20 minutes in plenary sessions, 48 talks of 20 minutes in two parallel sessions, and 74 talks of 5 minutes each in two parallel sessions. (And all papers would also be presented in poster sessions.)
Benefits: (a) Allows a substantial increase in the number of accepts to 130 while staying with two parallel sessions. (b) May lead to a less risk-averse PC (i.e., a more diverse conference) while maintaining a very high-quality core. (c) Attendees get to tailor their consumption of content. (d) A 5-minute talk is still enough for the presenter to give a sense of the work and publicize it. Each attendee gets exposed to ½ of the overall program instead of 1/3; this is an efficient use of their attention span.
Possible objections: (a) Effect on tenure/promotion. (b) Noisiness of the signal. (c) Authors are worse off.
I think (a) will become a non-issue. If today’s tenure case has X STOC papers, tomorrow’s might have X/2 papers with 20-min talks and X with 5-min talks. (b) Yes, PCs are fallible, but weigh that against all the benefits above. If we don’t believe in PC judgement we might as well disband STOC.
For (c), let’s do a quick Pareto analysis. The comparison plan on the table is 95 accepts: 8 plenary talks + 87 talks of 20 min in three parallel sessions. (We need three sessions because of the substantial plenary component being added.)
With 130 accepts the turnout will be higher; perhaps 25% higher. Authors are trying to maximize the number of people exposed to their paper. The basic math is that ½ of 125% is roughly twice 1/3 of 100%. We’ll see that all authors are much better off under this proposal, except those whose paper had a nominal “rank” of 57-95 in the PC process, who both gain and lose.
Rank 1-8: Somewhat better off (125% vs 100%)
Rank 9-56: Significantly better off (62% vs 33%).
Rank 57-95: Gain and lose (an audience of 62% instead of 33%, but a 5-min talk instead of a 20-min one).
Rank 96-130: Significantly better off. Their paper gets into proceedings, and they get 5 min to pique the interest of 62% of the audience (without waiting half a year to resubmit).
Every ex-PC member will tell you that papers that end up in the 3rd category were equally likely to be in the 4th, and vice versa. Knowing this, rational authors should prefer this new plan. It makes smarter use of a scarce resource: attendees’ attention span.
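The exposure arithmetic behind that rank table can be sketched in a few lines (the 25% turnout boost is the post’s own assumption):

```python
# Fraction of the *current* attendee pool that sees a given talk,
# given relative turnout and the number of parallel sessions.
def exposure(turnout, parallel_sessions):
    return turnout / parallel_sessions

old = exposure(1.00, 3)   # 95-accept plan: three parallel sessions
new = exposure(1.25, 2)   # 130-accept plan: two sessions, 25% more attendees
print(f"{old:.1%} -> {new:.1%}")  # roughly 33% -> 62%
```

The one-line model makes the trade-off explicit: the only contested variable for ranks 57-95 is talk length, since their audience fraction strictly improves.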
[Boaz’s note – this is another in the series of personal posts on STOC/FOCS reform, this time from Moshe Vardi, a renowned theoretical computer scientist who is also the Editor-in-Chief of Communications of the ACM. See also the discussion that’s still going on in the comment section of Omer Reingold’s post, as well as all discussions under this tag.]
Why Doesn’t ACM Have a SIG for Theoretical Computer Science?
Wikipedia defines Theoretical Computer Science (TCS) as the “division or subset of general computer science and mathematics that focuses on more abstract or mathematical aspects of computing.” This description of TCS seems rather straightforward, and it is not clear why there should be geographical variations in its interpretation. Yet in 1992, when Yuri Gurevich had the opportunity to spend a few months visiting a number of European research centers, he wrote in his report, titled “Logic Activities in Europe,” that “It is amazing, however, how different computer science is, especially theoretical computer science, in Europe and the US.” (Gurevich was preceded by E.W. Dijkstra, who wrote EWD Note 611, “On the fact that the Atlantic Ocean has two sides.”)
This difference between TCS in the US (more generally, North America) and Europe is often described by insiders as “Volume A” vs. “Volume B”, referring to the Handbook of Theoretical Computer Science, published in 1990 with Jan van Leeuwen as editor. The Handbook consisted of two volumes: Volume A, focusing on algorithms and complexity, and Volume B, focusing on formal models and semantics. In other words, Volume A is the theory of algorithms, while Volume B is the theory of software. North American TCS tends to be quite heavily Volume A, while European TCS tends to encompass both Volume A and Volume B. Gurevich’s report focused on activities of the Volume-B type, which is sometimes referred to as “Eurotheory”.
Gurevich expressed his astonishment at discovering the stark difference between TCS on the two sides of the Atlantic, writing that “The modern world is quickly growing into a global village.” And yet the TCS gap between the US and Europe is quite sharp. To see it, one only has to compare the programs of the premier North American TCS conferences – the IEEE Symposium on Foundations of Computer Science (FOCS) and the ACM Symposium on Theory of Computing (STOC) – with that of Europe’s premier TCS conference, Automata, Languages, and Programming (ICALP). In spite of its somewhat anachronistic name, ICALP today has three tracks with quite broad coverage.
How did such a sharp division arise between TCS in North America and Europe? This division did not exist prior to the 1980s. In fact, the tables of contents of the FOCS and STOC proceedings from the 1970s reveal a surprisingly (from today’s perspective) high level of Volume-B content. In the 1980s, the level of TCS activity in North America grew beyond the capacity of two annual single-track three-day conferences, which led to the launching of what were known then as “satellite conferences”. Having shed “satellite” topics, FOCS and STOC were able to specialize, developing a narrower focus on TCS. But the narrow focus of STOC and FOCS, the two premier North American conferences, in turn has influenced what is considered TCS in North America.
It is astonishing, therefore, to realize that the term “Eurotheory” is used somewhat derogatorily, implying a narrow and esoteric focus for European TCS. From my perch as Editor-in-Chief of Communications, today’s spectrum of TCS is vastly broader than what is revealed in the programs of FOCS and STOC. The issue is no longer Volume A vs. Volume B or North America vs. Europe (or other emerging centers of TCS around the world), but rather the broadening gap between the narrow focus of FOCS and STOC and the broadening scope of TCS. It is symptomatic indeed that, unlike the European Association for Theoretical Computer Science, ACM has no Special Interest Group (SIG) for TCS. ACM does have SIGACT, a Special Interest Group for Algorithms and Complexity Theory, but its narrow focus is already revealed in its name [Paul Beame comments below that SIGACT stands for “Algorithms and Computation Theory” -B.].
This discussion is not of sociological interest only. The North American TCS community has been discussing over the past few years possible changes to the current way of running its two conferences, considering folding FOCS and STOC into a single annual conference of longer duration. A May 2015 blog entry by Boaz Barak is titled “Turning STOC 2017 into a ‘Theory Festival'”. The proposal focuses on changing the conference from the standard recitation of fast-paced research talks to a richer scientific event, with invited talks, workshops and tutorials, social activities, poster and rump sessions, and the like.
I like the proposed directions for FOCS/STOC very much, but I would also like to see the North American TCS community show a deeper level of reflectiveness about the narrowing of its research agenda, starting with the question posed in the title of this editorial: Why doesn’t ACM have a SIG for Theoretical Computer Science?