# On the importance of the alphabet

In my last post, we saw that the problem of learning juntas, hard as it is over Boolean inputs, seems even harder over other alphabets. Coding theory happens to have an inexhaustible supply of such problems. Some are long-standing open problems, others are of more recent vintage, and new ones seem to crop up all the time. As will be clear from the list below, this issue does seem to follow me around.

Here are four of my favorite problems, completely solved for $q =2$ but wide open for any other alphabet (which we take to be some finite field).

**Optimal codes of distance $d$:** Arguably the mother of all such problems. How many parity checks must a $q$-ary code of length $n$ and distance $d$ have? Here we think of $d$ as a constant and let $n$ go to infinity. For $q = 2$, the Hamming bound tells us that $d \log n/2$ parity checks are necessary, and the BCH construction tells us that this is also sufficient. For larger alphabets, the best lower bound is still the Hamming bound, and the Hamming bound is still $d \log n/2$. Alas, the BCH upper bound is now $(1 - \frac{1}{q})d \log n$. For more on this problem and its connections to additive combinatorics, see here.
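As a sanity check on the binary numbers, here is a small Python sketch (mine, purely illustrative) that computes the Hamming (sphere-packing) lower bound on the number of parity checks and compares it to the roughly $\frac{d-1}{2}\log_2(n+1)$ checks a binary BCH code of designed distance $d$ uses:

```python
from math import comb, log

def hamming_parity_checks(n, d, q=2):
    """Sphere-packing (Hamming) lower bound on parity checks:
    q^r >= sum_{i<=t} C(n,i)*(q-1)^i with t = (d-1)//2 errors corrected."""
    t = (d - 1) // 2
    volume = sum(comb(n, i) * (q - 1) ** i for i in range(t + 1))
    return log(volume, q)

# Binary BCH with n = 2^m - 1 and designed distance d = 2t + 1
# uses at most t*m = t*log2(n+1) parity checks.
n, d = 1023, 5
t = (d - 1) // 2
print(hamming_parity_checks(n, d))  # lower bound, just under t*log2(n)
print(t * log(n + 1, 2))            # BCH upper bound
```

For constant $d$ and growing $n$ the two binary quantities match up to lower-order terms, which is exactly the $q = 2$ story above.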

**List-decoding Reed-Muller codes:** Given a function $f : \mathbb{F}_q^n \rightarrow \mathbb{F}_q$, our goal is to recover all degree-$d$ polynomials that are close to $f$. Of course, we can hope to solve this in reasonable time only if the list is not too large (say, a constant independent of the dimension $n$). The question is what bound on the distance from $f$ (equivalently, the error-rate) ensures that the list is small; the conjecture is that the right bound is the minimum distance of the degree-$d$ Reed-Muller code. The degree-$1$ case is the problem of decoding the Hadamard code, solved by Goldreich and Levin. In joint work with Klivans and Zuckerman, we solved the $q = 2$ case for all $d$, and in a subsequent paper, the $d = 2$ case for all $q$. The case of larger $q$ and $d$ is open.

The bottleneck in working with larger alphabets is the following. Fix a polynomial $P$ that has good agreement with $f$. The Goldreich-Levin and GKZ algorithms “guess” the values of $P$ on a $c$-dimensional subspace $S$, where $c$ is a constant. They then use this guess to recover $P$ on every $(c+1)$-dimensional subspace containing $S$. Why is this easier? Knowing the right values on a subspace of co-dimension $1$ effectively reduces the error-rate by a factor of $1 - 1/q$. For $q = 2$, this halves the error and puts us in the unique-decoding regime, where things are easy. But for larger $q$, this error-reduction does not always suffice to specify $P$ unambiguously.
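For very small $n$, the degree-$1$, $q = 2$ problem can even be solved by brute force over all $2^n$ parities, which may help make the setup concrete. This sketch is purely illustrative and mine; the point of Goldreich-Levin is to achieve the same in time polynomial in $n$:

```python
from itertools import product

def list_decode_hadamard(f, n, max_rel_dist):
    """Return all linear maps x -> <a,x> mod 2 within relative distance
    max_rel_dist of f, by brute force over all 2^n candidate vectors a."""
    points = list(product([0, 1], repeat=n))
    good = []
    for a in product([0, 1], repeat=n):
        dist = sum(f[x] != sum(ai * xi for ai, xi in zip(a, x)) % 2
                   for x in points)
        if dist <= max_rel_dist * len(points):
            good.append(a)
    return good

# A parity with one of its 16 labels flipped is still uniquely recovered,
# since distinct parities disagree on half of all points.
n = 4
f = {x: (x[0] + x[2]) % 2 for x in product([0, 1], repeat=n)}
f[(0, 0, 0, 0)] ^= 1  # corrupt a single label
print(list_decode_hadamard(f, n, 0.25))  # -> [(1, 0, 1, 0)]
```

At error-rate up to $1/2 - \epsilon$ the list stays small but can have more than one entry, which is the regime where Goldreich-Levin operates.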

**Decoding from errors and erasures:** Assume that we have an $[n,k,d]_q$ linear code. Such a code can correct $(d-1)/2$ errors and $d-1$ erasures. Assume that we have efficient algorithms for both (the latter always exists for a linear code). Can we combine them in a black-box fashion to get an efficient algorithm that corrects $e$ errors and $f$ erasures as long as $2e + f \leq d-1$? When $q = 2$, the answer is yes (take a guess!). I don’t know the answer for larger $q$. I came upon this problem in conversations with Atri Rudra; perhaps there is a reference for it? Or better still, a solution?
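At the risk of spoiling the guess: one natural $q = 2$ combination is to fill the erased positions once with $0$s and once with $1$s, run the errors-only decoder on both, and keep the candidate closer to the received word on the unerased positions. Here is a toy sketch of that idea (mine), with the $[5,1,5]$ repetition code standing in for an arbitrary errors-only decoder:

```python
def repetition_decode(word, n):
    """Unique errors-only decoder for the [n,1,n] repetition code:
    majority vote, corrects up to (n-1)//2 errors."""
    return [1] * n if 2 * sum(word) > n else [0] * n

def errors_and_erasures(word, n, decode):
    """Binary errors+erasures from an errors-only decoder: fill the
    erasures ('?') with all-0s and with all-1s, decode both fillings,
    and keep the candidate closer to word on the unerased positions.
    One filling adds at most f/2 errors, so e + f/2 <= (d-1)/2 suffices."""
    erased = {i for i, c in enumerate(word) if c == "?"}
    best, best_dist = None, None
    for fill in (0, 1):
        filled = [fill if c == "?" else c for c in word]
        cand = decode(filled, n)
        dist = sum(cand[i] != word[i] for i in range(n) if i not in erased)
        if best is None or dist < best_dist:
            best, best_dist = cand, dist
    return best

# d - 1 = 4, so e = 1 error plus f = 2 erasures satisfies 2e + f <= d - 1.
received = [1, 0, "?", 1, "?"]  # sent all-1s; one flip, two erasures
print(errors_and_erasures(received, 5, repetition_decode))  # -> [1, 1, 1, 1, 1]
```

The selection step is sound because a wrong codeword differs from the truth in at least $d - f$ unerased positions, hence is at distance more than $e$ from the received word whenever $2e + f \leq d - 1$. The same trick does not obviously extend beyond $q = 2$, since there are $q^f$ fillings to try.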

**Learning linear functions with noise:** Learning parity with noise is a central problem in learning theory that is widely believed to be hard. We are given random examples of points in $\mathbb{F}_2^n$ labeled by some unknown parity function $L$; the catch is that each label is flipped independently with probability $0.1$. A seemingly even harder problem is when the noise is adversarial, and the adversary is allowed to flip an arbitrary $0.1$ fraction of the labels over all points in $\mathbb{F}_2^n$. In joint work with Feldman, Khot and Ponnuswami, we showed that the two problems are in fact equivalent: there is a reduction from adversarial noise to random noise.

In retrospect, perhaps this is not too surprising: learning parity with noise is already so very hard that making the noise adversarial does not make it worse. Well, the same philosophy ought to apply to other alphabets. But I don’t know of a similar reduction for any other alphabet. This paper makes some progress on this problem, reducing adversarial noise to a more benign noise model, but does not solve the problem fully.
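To make the random-noise model concrete, here is a minimal sample generator (all names are mine, purely illustrative). The adversarial model would instead flip a worst-case $0.1$ fraction of the $2^n$ labels, chosen with full knowledge of the learner:

```python
import random

def lpn_samples(secret, m, eta=0.1, rng=random):
    """m random examples (x, <secret,x> mod 2), where each label is
    flipped independently with probability eta (random-noise LPN)."""
    n = len(secret)
    out = []
    for _ in range(m):
        x = [rng.randrange(2) for _ in range(n)]
        label = sum(s * xi for s, xi in zip(secret, x)) % 2
        if rng.random() < eta:
            label ^= 1  # Bernoulli(eta) label noise
        out.append((x, label))
    return out

rng = random.Random(0)
secret = [1, 0, 1, 1]
samples = lpn_samples(secret, 2000, 0.1, rng)
agree = sum(lbl == sum(s * xi for s, xi in zip(secret, x)) % 2
            for x, lbl in samples)
print(agree / 2000)  # close to 0.9: about a 0.1 fraction of labels are noisy
```

Both models present the learner with a $0.9$-agreeing parity; the difference is only in who chooses which labels to corrupt.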

If you have a problem of this flavor, I’d love to hear about it. I am unlikely to have anything insightful to say, but I’d be happy to commiserate with you about how annoying $q = 3$ can be.