Still, while we are all taking down our TheoryFest trees, we can remind ourselves that there are other holidays on the theory calendar. In particular FOCS 2018 will be in October in Paris, and it also contains a “workshop and tutorial day”. If you want to organize a workshop or tutorial, see the call for proposals. Key points are:

- Workshop and Tutorial Day:
**Saturday, October 6, 2018**(Paris) - Workshop and Tutorial Co-Chairs:
**Robert Kleinberg**and**James R. Lee** - Submission deadline:
**August 1st, 2018** - Notification:
**August 6th, 2018** - Send proposals and questions to
**focs2018workshops@gmail.com**

There are worse things in life than organizing a workshop in Paris..

]]>

I just got back from dinner with some of the great speakers who will be at our TheoryFest workshop tomorrow afternoon on computational thresholds for average-case problems, and I am very excited for what’s coming! Since I didn’t get much chance to introduce the speakers in my last post, and not all of them are the “usual suspects” for a STOC workshop, I’d like to take a few paragraphs to do so here, and discuss a little further what they might speak about.

The talks will run the gamut from statistical physics to machine learning to good old theory of computing, and in particular will aim to address questions at their 3-wise intersection. This intersection is full of open problems which are both interesting and approachable. I hope to see lots of people there!

On to the speakers:

**Florent Krzakala: **Florent is a statistical physicist at the Sorbonne in Paris. For more than 10 years he has been one of the leaders in studying high-dimensional statistical inference and average-case computational problems through the lens of statistical physics.

One of my favorite lines Florent’s work was the construction (with several others) of random measurement matrices and corresponding sparse-signal reconstruction algorithms which can recover a random sparse vector with nonzero entries from the measurement , *where the dimension of is only *. (That is, algorithms which recover a -sparse vector from only measurements — not , or even !) This involved bringing together insights from compressed sensing and spin-glass theory in a rather remarkable way.

Lately, Florent tells me he has been interested in the physics of matrix factorization and completion problems, and of neural networks. Tomorrow he is going to discuss computational-versus-statistical gaps for a variety of high-dimensional inference problems, addressing the question of why polynomial-time inference algorithms don’t necessarily achieve information-theoretically-optimal results from a statistical physics perspective.

**Nike Sun: **Nike is a probabilist & computer scientist at UC Berkeley, before which she was the Schramm postdoctoral fellow at MIT and MSR. Many of you may know her from the tour de force work (joint with Jian Ding and Allan Sly) establishing the -SAT threshold for large — this was the biggest progress in years on **the **problem in random CSPs.

Her research spans many topics in probability, but tomorrow she is discussing random CSPs, on which she is among the world experts. In this area her work has made great steps in rigorous-izing the predictions of statistical physics regarding the *geometry of the space of solutions to a random CSP instance. *This geometry is rich: at some clause densities random CSPs seem to have *well-connected *spaces of solutions, and at others the solution spaces are *shattered. *In between are a wealth of phase transitions whose algorithmic implications remain poorly understood (read: a great source of open problems!).

**Jacob Steinardt: **Jacob is a graduating PhD student at Stanford, and a rising star in provable approaches to machine learning. One of his focuses of late has been design of algorithms for* *learning problems in the presence of untrustworthy data, model misspecification, and other challenges the real world imposes on the idealized learning settings we often see here at STOC.

His paper with Moses Charikar and Greg Valiant on *list-learning *has attracted a lot of attention and sparked several follow-up works at this year’s STOC. In that paper, Jacob and his coauthors realized that the problem of learning parameters of a distribution when only a tiny fraction (say 0.001 percent) of your samples come from that distribution provides a generalization of and useful perspective on many classic learning problems, like learning mixture models. (In the list learning setting, one aims to output a *list *of candidate parameters such that one of the sets of parameters is close to the parameters of the distribution you wanted to learn.)

Jacob’s work has explored the notion that polynomial-time tractability of learning problems is intimately related to robustness. A caricature: if any learning problem solvable in polynomial time can also be solved in polynomial time in the presence of some untrustworthy data, then polynomial-time algorithms cannot solve inference problems which are *statistically impossible in the face of untrustworthy data. *This offers an intriguing explanation for the existence of average-case/statistical problems which are unsolvable by polynomial-time algorithms, in spite of their information-theoretic (i.e. exponential-time) solvability.

**I **will also give a talk.

See you there!

**Edit: **correct some attributions.

TheoryFest is in full swing in Los Angeles! There is lots to be written about the wealth of STOC talks, invited papers, keynotes, and (maybe most importantly) the excellent food hall across the street which is swarming with theorists and mathematicians. But for now I want to bring to your attention the workshops planned for Friday.

As always, the schedule is packed with interesting talks! There are workshops on major themes in theory, like

- deep learning
- fairness in algorithms
- recent constructions of extractors
- Vijay Vazirani’s birthday, and
- average-case problems and computational thresholds

Since I helped organize the last of these, let me say a bit about it.

Recently we have seen a surge of interest in algorithms and lower bounds for average-case problems. One driver of this surge has been theory’s increasing connection with machine learning, which has suggested a number of high-dimensional and noisy statistical inference problems to study. Another driver has been the substantial progress on long-standing problems regarding random instances—for example, a series of recent works proving many old conjectures on random *k*-SAT instances, and another series of works analyzing algorithms for community detection and recovery in very sparse random graphs.

A common theme in average-case computational problems is that their *statistical *difficulty (that is, whether they are solvable by any algorithm at all, regardless of running time) is not predictive of their *algorithmic *difficulty (that is, whether they are solvable by polynomial-time algorithms).

A now-standard example is the -community stochastic blockmodel. In this problem, nodes are randomly partitioned into communities, and then a sparse random graph of average degree is sampled on those vertices is sampled by including an edge with probability if are in differing communities and otherwise with probability (this produces a graph of average degree ).

It turns out that so long as for a big-enough constant , in exponential time one may recover the underlying communities from the graph (*approximately, *in the sense that it is possible to partition the graph into classes such that this partitioning has nonvanishing correlation with the ground-truth partition).

However, it is conjectured that this community recovery is possible in polynomial time *if and only if (!!) *. This threshold is conjectured to be *sharp*, in the sense that if no polynomial-time algorithm should exist, while one is known to when .

This kind of threshold phenomenon is ubiquitous for high-dimensional inference problems, and remains largely unexplained. Of course, we are used to algorithmic intractability! But the appearance of intractability for average-case problems seems not to be explained by worst-case theories of hardness (like NP-completeness). And very little is known about what features of an average-case problem are predictive of computational tractability, and which might predict intractability (while in the worst case we know that, say, NP-completeness is an excellent predictor of intractability). The *sharpness *of a threshold like is striking and mysterious in a field where ignoring logarithmic factors is practically a (inter)national pastime.

We will have four talks, spanning algorithms and complexity, with (at least) four perspectives on computational thresholds for average-case problems and related matters. **Florent Krzakala** will lead off with a statistical-physics perspective on algorithms and hardness for matrix and tensor factorization problems. **Nike Sun** will give us a taste of the wild world of random constraint satisfaction problems.** Jacob Steinhardt** will talk about the relationship of outlier robustness and inaccurate/partial problem specification to efficient algorithms for statistical inference. Finally, I will survey some recent developments at the intersection of convex programming, the Sum of Squares method, and high-dimensional inference problems.

I hope to see you there!

]]>I don’t think this is only about percentages. Regardless of ratios, I will personally be happy when I hear from my female colleagues and students that they feel fully included and respected. But we still have a long way to go to get there.

]]>There are many great papers, workshops, and other content in the program, but I am particularly looking forward to the Invited Short Plenary talks (and not just because I was involved in their selection..). I think this is a very unique program not just in theory, but among all CS conferences, where the audience gets a chance to see a collection of fantastic works that appeared in a variety of venues. What other event brings together a talk about routing algorithms in data centers and category theory? Or a talk about mechanism design for school choice and the local minima landscape for machine learning?

The talks this year will be the following (in order of presentations – the sessions will be Monday-Thursday at 4 or 4:10pm, the STOC best papers and Lydia Kavraki’s Athena lecture will also be given in these sessions):

*Sanjoy Dasgupta***:**(Science)**A Neural Algorithm for Similarity Search***Monia Ghobadi:*(SIGCOMM 2016)**Programming the Topology of Networks: Technology and Algorithms***Irene Lo:*(MATCH-UP 2017)**Dynamic Matching in School Choice:Efficient Seat Reassignment after Late Cancellations***Vyes Sekar:*(SIGCOMM 2016)**Rethinking Network Flow Monitoring with UnivMon***Rong Ge:*(NIPS 2016 and ICML 2017)**Optimization Landscape for Matrix Completion***Sergey Bravyi:***Quantum Advantage with Shallow Circuits**(QIP 2018)*Sam Staton:*(LICS 2017)**Higher-Order Probability Theory***Tengyu Ma:*(ICML 2017)**Generalization and Equilibrium in Generative Adversarial Nets (GANs)***Dan Suciu:***Optimal Query Processing meets Information Theory**(PODS 2016)

*[Jarek gave a great informal exposition of this paper on Friday, and I asked him to make it into a blog post. My only knowledge of the paper is from Jarek’s explanations: my main contribution to this post was to delete some details and insert some inaccuracies –Boaz.]*

A recent exciting paper of Ran Raz and Avishay Tal solved a longstanding problem in quantum complexity theory: the existence of an oracle under which the class is not contained in the polynomial hierarchy. The proof is elegant and actually not extremely complicated, and in this blog post we discuss the main idea behind it. For some of the motivation and background of this problem, see the posts of Scott Aaronson and Lance Fortnow. As is actually often the case in such relativization result, the proof does not have much to do with either or the polynomial hierarchy. Rather at its heart is a new lower bound against the perennial “victim” of circuit lower bounds: the class of constant depth circuits with AND, OR and NOT gates.

Rather than thinking about a single choice of an oracle, it turns out that it’s better to think of a *distribution* over oracles, and (for notational convenience) as a distribution over *pairs* of oracles .

They show that:

- There is a quantum algorithm that can make one quantum query to each of these oracles, and using this distinguish with advantage at least between the case that and are sampled from , and the case that and are sampled from the uniform distribution. (Set for concreteness.)
- For every constant depth polynomial size circuit on inputs (where ), the probability that outputs on sampled from differs by at most from the probability it outputs on sampled from the uniform distribution.

Together 1 and 2 imply the main results by standard arguments.

The distribution they use is (a variant of) the “Forrelation” distribution proposed for this task by Aaronson. Namely, they choose to be correlated with the Fourier transform of .

Specifically however, it is better to think about this distribution as follows:

- Sample as correlated Gaussian random variables over with the covariance matrix where is the Hadamard matrix defined as for every . Another way to think about this is that is a standard normal, and . Note that for any , .
- Then let be the distribution over that has the expectation . That is, for , we choose as a random variable with expectation and as a random variable with expectation . If or are larger than and each coordinate is a standard Normal then we truncate them appropriately, but because we set this will only happen with probability which we can treat as negligible and ignore this event from now on.

The heart of the proof is in the classical part, and so we will merely sketch why this problem is easy for quantum algorithms. A Quantum Algorithm that gets query access to and , which we think of as functions mapping to can obtain the state

The algorithm can then perform the Quantum Fourier Transform (in this case applying Hadamard to its qubits ) to change this to the state

But in our case, will be correlated with and so this can be shown to have the form

where is some state on the last qubits.

If we now perform the Hadamard operation on the first qubit and measure it, we will be about more likely to see a zero than to see a 1. On the other hand it is not hard to show that if the input was uniform then we will see zero and one with equal probability. The advantage of can be amplified by making more queries or considering “tensored” forms of the same oracles.

We now come to the heart of the proof: showing that it is *hard* to distinguish from the uniform distribution for circuits. One of the challenges of obtaining such an oracle separation is that the standard approach of showing such a result, which is to show that fools low degree polynomials, is not available to us. Indeed, our quantum algorithm itself is a degree 2 polyonomial in and !

At a high level, the proof shows that AC0 circuits can be approximated (to a certain extent) by very special kind of low degree polynomials – ones that are *sparse* in the sense that the L1 norm of their coefficients is not much larger than the L2 norm.

Anyway, we are getting ahead of ourselves. Let be computable by a polynomial size constant depth circuit. Our goal is to show the following result:

**Main Theorem:**

We will identify such a function with its *multilinear* extention, and hence think of as the polynomial . Note that in this notation, .

We start with the following simple observation: for every multilinear map , and , if we sample from a product distribution where the expectation of each coordinate matches the coordinate in , then . This means that for the purposes of analyzing the main theorem, we could just as well imagine that is sampled from the normal multivariate distribution over with correlation matrix .

Hence the result will follow from the following result:

**Main Lemma** Let be any multivariate Gaussian distribution over such that and for . Then is pseudorandom w.r.t. multilinear polynomials computable by polynomial sized circuits of constant depth.

The notation hides factors poly-logarithmic in . In our case, the distribution satisfies , and and so we get that (and hence our original ) is pseudorandom, completing the proof of the main theorem.

Note that the Main Lemma is of interest in its own right – it says that for a multivariate Gaussian to fool AC0, it is enough that the second moments sufficiently decay as long as the diagonal entries are moderately bounded away from 1. (However, for multivariate Gaussians, a bound on the second moments also implies decay of higher moments by Isserlis’ theorem.)

The Main Lemma is proved by a “Hybrid Argument”. (Related to arguments used by Chattopadhyay, Hatami, Hosseini and Lovett for constructing pseudorandom generators.) We will set . Since a sum of Gaussians is a Gaussian, for every multivariate Gaussian , we can write

where is identically distributed to .

We write . To prove the Main Lemma we need to show that for every constant-depth poly-size computable multilinear . To do so we will establish the following inequality:

for every . (Writing for the vector.)

We can obtain the main lemma by summing up over all .

The main tool we will use to establish is the following structural result on AC0, which is again of independent interest:

**L1 bound theorem:** Let be a function computable by an circuit of quasipolynomial size and constant depth, and let and . Then the function defined as satisfies that

for every .

For those bounds on Fourier coefficients were proven by Avishay Tal. It turns out that general case can be essentially reduced to this case by a clever random restriction.

The L1 bound theorem implies , by writing for sampled from . (The event holds with high probability. It can be shown that conditioning on this event changes the relevant expectations in by at most an additive term that is proportional to the failure probability, which is smaller than .)

We can then bound the lefthand side of , by

the moment for vanishes. In the moments arising from , the term makes the contribution negligible, even when we sum it over terms — by Isserlis’ theorem we mentioned above we can bound . The only interesting term is when where we get exactly as desired. (As an aside, Isserlis’ Theorem gives a way to compute the moments of any multivariate Gaussian, and has a beautiful two line proof using Taylor expansion of in the formal variables .)

We now show the proof of the L1 bound theorem. We say that function is a restriction of , if where .

**Claim:** For every fixed and multilinear there exist a random which is restriction of , such that

Indeed, for given , take a variable , such that on each coordinate independently we have with probability , with probability , and otherwise. Note that . Since is multilinear, and is independent across coordinates, we have

where depending on realization of is a corresponding restriction of — it fixes coordinates where accordingly.

Now the L1 bound theorem follows by linearity of Fourier transform: for specific we have . Every realization of as a restriction of an circuit is again an circuit (both size and depth can only decrease), hence we can apply the bounds on Fourier coeffiecients of circuits to , namely . Plugging those together we get

which is what we needed to prove.

]]>

I am now teaching a large course at Harvard with a staff of about 18 people. I use slack for communication with the staff, Piazza for communication with the students, Gradescope for submitting student work and grading exams, while the course data is kept in a private GitHub repository (see below).

These days I use markdown for almost everything. For me it is basically plaintext where I can also write math between dollar signs. I use markdown to write my all my lecture notes including my 500+ pages intro to TCS book. I use it to write short technical notes and emails: in particular the Markdown Here extension lets me write math in gmail, and Hackmd is a good way to share short notes. Finally, I am also using it to produce presentations for my courses. (For one-off presentations such as talks, I still prefer PowerPoint.)

For much of these I use pandoc to transform markdown into various formats, including latex/pdf, html, and html slides. I still use Atom as my main editor, and it’s only become better over time. I use the sync-setting package to synchronize settings between machines and autosave to avoid losing my work. It took me forever to figure out how to enable spell checking using the spell-check package on Windows. The trick is to enable the `SPELLCHECKER_PREFER_HUNSPELL`

environment variable.

Two apps that save me a ton of time are Breevy (a text expander) and Ditto (a clipboard manager). A *clipboard manager* simply remembers all the things you copied. For example, if you are filling out a recommendation or review form and need to copy stuff from your text file, you can copy all the fields first and then paste them one by one. A *text expander* allows you to expand short pieces of text into longer ones. For example, these days I am co director of undergraduate studies at Harvard and find often the need to write the same email again and again. For example, if I type `:dusadmit`

in any window, it automatically expands to

Dear X,

Thanks for your email and your interest in Harvard. For information about admission to the Harvard undergraduate program, please see the admissions website, https://college.harvard.edu/admissions, which has information about the application process and requirements. Some additional information for international students is at https://college.harvard.edu/admissions/application-process/international-applicants.

Information for prospective undergrads interested in the School Engineering and Applied Sciences (SEAS) is at https://www.seas.harvard.edu/audiences/prospective-undergraduates, and some information about Computer Science is at https://www.seas.harvard.edu/programs/computer-science and https://cs50.harvard.edu/guide.

Please note that the Computer Science Directors of Undergraduate Studies are not directly involved in the Harvard admissions process, and so questions about admissions should be directed to the admissions office.

Best and good luck,

Boaz Barak

Gordon McKay Professor of Computer Science

John A. Paulson School of Engineering and Applied Sciences

Harvard University

The text expander even stops to ask me for the name so `X`

above is replaced with it. This way I answer with pertinent and helpful information without spending all my time typing. For mass personalized emails, I use the Mail Merge for Gmail add on.

I also use the text expander for various other goals. For example, no matter in which program I am, if I type `\bits`

it always expands to `\{0,1\}`

. (Indeed I find myself using less and less custom latex commands these days.) I also use it to automate my workflow with git, which is still my favorite version control system. If I type `:gitsub`

it gets expanded to `git submodule foreach 'git add -A ; git commit -m wip ; git pull --rebase ; git push'; git add -A ; git commit -m wip ; git pull --rebase ; git push`

which basically updates my repository in full. I use it also to remember folder names: if I type `:cs121`

then it expands to `C:\DBOX\WORK\Courses\CS121\`

As mentioned above, I use git for keeping track of papers, courses, and many other projects. You can host git repositories on GitHub, Bit Bucket, or Gitlab, and there are other services as well. I like to put my git repositories inside my dropbox folder even though you are not supposed to do that. The way I handle it is that I have for each project two subfolders `Laptop`

and `Desktop`

that contain two copies of the same repository. I only use `Laptop`

on my laptop and `Desktop`

on my desktop and so I never touch the same folder from two computers. So far it seems to work.

Netlify is a great service that allows you to keep a website updated from a repository. It works well together with Hugo, which is a website generator that allows you to write your website using.. you guessed it .. markdown.

These days I find myself in the command line a lot for compiling markdown, pushing things to git etc.., cmder is a nice console app for Windows. Sumatra is still my favorite pdf viewer. For drawing quick doodles using the surface pen, the sketchpad app that comes with Windows (“Windows Ink Workspace”) is actually quite nice. I sometimes need to write some code snippets – anaconda and jupyter lab are really nice (and not windows specific). And of course I use MikTex.

Anyway, that’s it about my technology use.. would be curious to learn what other people are using to save time and effort.

]]>

Teaching it was a great fun because of the fantastic group of students that took the course. They were genuinely interested in cryptography, and kept asking me extremely interesting questions and had excellent insights. Thanks also to Yael Kalai and Bruce Schneier for giving guest lectures in this course.

Though I haven’t done so in a while, I decided to do course projects this term. I was very impressed on what these students (most of whom are undergraduates) managed to achieve in one term. So, since a blog is not much use if you can’t show off your students, without further ado here are the projects in the course:

- A Cryptanalysis of IOTA’s Curl Hash Function / Michael Colavita and Garrett Tanzer. The IOTA crypto currency initially used a tailor made hash function known as “Curl” or “Curl-P”. Unfortunately, a collision was found by Heilman et al. However, Heilman did not make their attack code available. In this project, Colavita and Tanzer managed to reproduce their result, and provide a complete description, as well as online available code for the collision-finding algorithm. Even if you are not interested in IOTA, reading this project is a great way to get familiar with some cryptanalytic techniques. Like any good cryptanalytic work, they even got flamed on Reddit.

- DeCERT: A Decentralized Certificate Authority / Leo Hentschker. This project describes what is a novel (to me at least) approach to deal with the issue of trust in certificates: decentralize the decision of what certificate to trust by voting on a block chain. Leo built the website http://decert.io/ where one can issue a certificate as well as voting on it using tokens based on the Ethereum currency. This was much more than I could have expected from someone to do as a course project!

- OpAwesome: The Good, the Bad, and the Fuzzy in the Secure Database Landscape: Sophie Hilgard and Wilson Qin. The project does a thorough exploration of the design space for encrypted data bases. It seems that every few months we hear of another break where millions of people’s personal information was compromised, and a theoretical cryptographer might wonder why don’t people simply encrypt all the databases with fully homomorphic encryption? Of course, even I know that this is not yet realistic, and generally there are tradeoffs between the levels of security and efficiency (and not less important, backward compatibility with legacy systems). The project covers the various solutions in this space, including some that are relatively easy to use as a “plug in replacement” for non encrypted data bases, but whose security is also limited, and others that have stronger notions of security. They also discuss their preliminary work on
`opAwsome`

: a secure database that allows for smoother tradeoffs between communication and client-side computation on the one hand, and security on the other hand.

- Fully Homomorphic Image Processing in the Cloud: William Fu, Raymond Lin, Daniel Inge. Fully homomorphic encryption is notoriously ineffieicnt, but
*partially homomorphic encryption*might sometimes be good enough. This project implements homomorphic compression of images. This can be useful for example to run a machine learning classifier on an image (such classifiers often work on compressed images). I was surpised this can actually be done, but they even have code (using Microsoft’s SEAL library) available on this repository.

- Introduction to Encrypted Voting Protocols / Eric Bornstein:
*Voting*underlies many of the most important decisions we as a society have to undertake. The importance of the security of (and public confidence in) voting procedures cannot be overstated, and yet many current implementations leave something to be desired. This comprehensive survey discusses the differnt goals we wish voting protocols to achieve, including accuracy, privacy, and freedom from coercion, and various cryptographic, as well as paper based (and combinations thereof) ways to achieve them. In particular it covers the ideas of*Mix Nets*(for anonymity),*Visual Cryptography*(for usablity) , and*Vote/Anti-vote pairs*(for receipt freeness). If you have any interest in voting, you’d probably want to take a look.

- Proof of sequential work / Meena Jagadeesan, Alec Sun, and Alex Wei:
*Proofs of work*have recently been of great interest to cryptographers, as they underlie the Bitcoin blockchain. However, the type of proofs used by Bitcoin and its ilk are inherently parallelizable, which gives some limits to their usability. For example, one cannot use them to measure elapsed time- only total cost.*Proofs of sequential work*adress this by providing a “puzzle” that can only be solved by $N$ sequential steps. The project surveys some of the older constructions as well as the very recent one of Cohen and Pietrzak, and discusses ideas for addressing some of the open problems in this area.

- Analysis of vulnerabilities and mitigations in Multiparty Computation / Leor Fishman: This project analyzed some subtle but important issues in multiparty secure computation. One of the interesting challenges in the area of multiparty computation is the idea of
*covert computation*: how can you hide not just*what*is being computed but even the mere fact that*something*is being computed? The project considers a strong (but well motivated!) notion of security where even some of the parties to the protocol would not know the identity of others, and shows some negative results. It then also studies issues of communication patterns in MPC, and in particular protocols that divide the communications into smaller groups or “quorums”. The project shows that the effect of such optimization on security is stronger than what one might have thought initially.

- A secure zero knowledge two factor authentication protocol / Albert Chalom, Nathan Wolfe: This project studies how one achieves the intersection of several goals in authentication: handling
*fuzzy*(and potentially low entropy) authentication data, such as fingerprint etc.., but while also making sure that (as is done with passwords) this data is not stored in the clear in the server. They show a “proof of concept” using two party secure computation.

- Shor’s Algorithm: Andrei Laurentiu Ciupan. Shor’s algorithm is of course a fundamental result that has energized much research on both quantum computing and cryptography over the last two decades. This project is a short (8 page) but almost complete (except for some results on continued fractions) description of the algorithm and its analysis.

- Query and Communication Lower Bounds for Key-Exchange Protocols: a Survey / Noah Golowich: This project touches a topic that’s dear to my heart – understanding the possiblity of implementing key exchange based only on an idealized random oracle, without making hardness of assumptions on
*structured*problems such as lattices, factoring, etc.. It surveys some of this highly technical literature, and mentions a very recent result on communication tradeoffs and some open problems.

- A survey on quantum secure cryptographic systems / Tomoka Kan: This projects the landscape of cryptographic schemes whose security was analyzed in the setting where not only
*computation*but also*communication*is quantum. This setting allows the adversary some extra powers, as they can make “super position queries” to the other parties, and the issue of security is quite subtle and interesting.

You can use the issues page on the GitHub repository to make any suggestions, corrections, reports on bugs and typos, and so on.

I also hope that others instructors would consider using this text in their courses, and would welcome suggestions as to what I can do to help in that. In particular, over the summer I plan to add more exercises (including some solved ones) and more examples, as well as more explanations and “proof ideas” for many of the theorems.

The main differences between this text and “classical approaches” to intro theory courses such as Sipser’s are the following:

**Circuits / straightline programs as the basic model instead of automata**

I do not start with *finite automata* as the basic computational model, but rather with *straight-line programs* (or, equivalently *Boolean circuits*) in an extremely simple programming language which I call the “NAND programming language” since its only operation is assigning to one variable the NAND of two others.

Automata are discussed later in the course, after Turing machines and undecidability, as an example for a restricted computational model where problems such as halting are effectively solvable. This actually corresponds to the historical ordering: Boolean algebra goes back to Boole’s work in the 1850’s, Turing machines and undecidability were of course discovered in the 1930’s, while finite automata were introduced in the 1943 work of McCulloch and Pitts but only really understood in the seminal 1959 work of Rabin and Scott. More importantly, the main current practical motivations for restricted models such as regular and context free languages (whether it is for parsing, for analyzing liveness and safety, or even for routing on software defined networks) are precisely because these are tractable models in which semantic questions can be effectively answered. This motivation can be better appreciated *after* students see the undecidability of semantic properties of general computing models.

Moreover, the Boolean circuit / straightline programs model is extremely simple to both describe and analyze, and some of the main lessons of the theory of computation, including the notions of the *duality between code and data*, and the idea of *universality*, can already be seen in this context. Nonuniform computation also gives essential background for some exciting topics I like to cover later in the course. For example, cryptography (making sense of notions such as “256 bits of security” or difficulty of bitcoin mining), pseudorandom generators and the conjecture that BPP=P, and quantum computing. Circuits of course also yield the simplest proof of the Cook Levin theorem.

**A programming language instead of Turing machines for modeling uniform computation**

Instead of Turing Machines, I introduce uniform computation using an equivalent model obtained by extending the straightline NAND programming language mentioned above to include loops and arrays (I call the resulting programming language “NAND++”). I do define Turing machines and show the equivalence to NAND++ program, so the students can connect this both to the historical context as well to other literature they may encounter.

I believe that the programming language formalism makes this model more concrete and familiar to the students than Turing machines. For examples, a result such as the equivalence of Turing machines with two dimensional tapes and one dimensional tapes is now described as showing that a language with two dimensional arrays can be “transpiled” into a language with one dimensional arrays only (i.e., two dimensional arrays can be implemented via “syntactic sugar”).

Moreover, because the NAND++ programming language is extremely simple and fully specified, it is easy to show its formal equivalence with TM’s. In fact, since NAND++ is essentially obtained by adding a “feedback loop” to Boolean circuits, this makes the Cook Levin theorem easier to prove. I even implemented the Cook Levin reduction in a Python notebook, and so students can see how one can transform a NAND++ program into a graph and a number such that has a cut of size if and only if the program had an input that makes it output . Alas, these graphs tend to be quite large:

I also introduce yet another extension of NAND++ that allows indirect access to arrays, and hence is essentially equivalent to the standard RAM machine model used (implicitly) in algorithms courses. (I call this language NAND<<.) Using a RAM based model makes the distinction between notions such as and time more meaningful, and makes the time complexity classes correspond to the informal definitions of linear and quadratic time that students encountered in their algorithms lectures (or their whiteboard coding interviews..).

**More advanced topics**

Reducing the time dedicated to automata (and eliminating context free languages, though I do touch on them in the text in an optional section) allows to spend more time on topics such randomness and computation, the interactions between *proofs* and *programs* (including Gödel’s incompleteness, interactive proof systems, and even a bit on the -calculus and the Curry-Howard correspondence), cryptography, and quantum computing.

The book is still work in progress, but I think is already quite usable as a textbook. Indeed, much of the feedback I got on the course was that people were happy with the text (though perhaps they merely preferred it to the lectures due to my teaching abilities ). One student even got me this coffee mug:

In any case, I hope to get more comments as I work on it over the summer. Regardless of whether or not it’s eventually published as a book, I intend to keep a version of it freely available online.

]]>

Correspondingly, the number of years people spend in school has been growing:

Most people would think this is a good thing. First of all, an educated population is a good in itself. Second, this seems correlated with growth in general:

However, Caplan believes education in high school and above is mostly a waste of time and money. (He is also suspicious about K-8 education, which he calls a “daycare center for kids”.) So in his mind, the GDP growth of advanced countries has been **in spite** of their investment in education. Who knows how far the U.S. could have come if the average years in school would have stayed at 8 or 10.

Caplan’s theory is that people’s productivity is just a function of innate skill, that is not changed at all by education. They only go to high school, college, and beyond due to an “arms race” of credentials aided by ease of access to education. If you own a business and believe Caplan’s theory, you’d probably want to set up shop as far away from universities as possible, where you could get quality workers on the cheap that did not waste time on education. Perhaps Amazon should hire Caplan as a consultant, they seem to be doing the exact opposite.

Caplan also wrote a book. Scott Aaronson is not impressed.

]]>