Guest post by Boaz Barak and Zvika Brakerski (part 2)

In the previous post, we demonstrated the versatility of fully homomorphic encryption and its applicability for multiple applications. In this post we will demonstrate (not too painfully, we hope) how fully homomorphic encryption is constructed. Our goal is to present the simplest solution that (we could think of and) works. We will not insist on achieving the best parameters or security guarantees, however the principles we describe are qualitatively the “state of the art” in FHE construction.

Let us get to work, then! We already explained in the previous post that in order to achieve full homomorphism, it is sufficient to homomorphically evaluate a universal set of gates. The universal set we will use is ${\{ XOR, AND \}}$, also known as addition and multiplication over ${GF(2)}$. (This is the “universal set of choice” for almost all known schemes.) This will allow us to evaluate any efficiently computable function by representing it as an arithmetic circuit over ${GF(2)}$ and performing homomorphic evaluation gate by gate. From this point and on, almost until the end of this post, we will thus forget about circuits and focus on homomorphic addition and multiplication modulo ${2}$.

Encryption and Decryption

You would think that a homomorphic encryption scheme must be incredibly complicated, but in fact it is extremely simple. The secret key is a random binary vector ${{\mathbf{s}} \in \{0,1\}^n}$ and to encrypt a bit ${\sigma \in \{0,1\}}$ we output a random real vector ${{\mathbf{c}} \in (-1,1]^n}$ satisfying

$\displaystyle {\mathbf{c}} \cdot {\mathbf{s}} \pmod{2} =_{\epsilon} \sigma \;. \ \ \ \ \ (1)$

Here ${{\mathbf{c}}\cdot {\mathbf{s}}}$ is the dot product ${\sum_{i=1}^n c_is_i}$, we write ${x =_{\epsilon} y}$ for ${|x-y|\leq \epsilon}$ and ${x \pmod{2}}$ denotes the number ${y\in (-1,1]}$ such that ${x=y+2I}$ for some integer ${I}$ (e.g., ${4.2 \pmod{2} = 0.2}$ and ${5.7 \pmod{2} = -0.3}$). The numbers ${n,\epsilon}$ are parameters of the system and are related to its security. Of course we won’t actually work with real numbers, but truncate them to some finite precision (this all works out as you would expect).

Given ${{\mathbf{s}}}$, it is not hard to generate such a random vector ${{\mathbf{c}}}$, and of course as long as ${\epsilon<1/2}$, Equation (1) also yields a decryption algorithm (just compute the dot product and round the answer). You may have noticed that we need the key ${{\mathbf{s}}}$ for both encryption and decryption. We thus get a symmetric key encryption scheme, rather than a public key scheme (where the encryption can be done using a different, public, key). However, this is not an issue since a homomorphic symmetric scheme implies a public key scheme with similar parameters. Therefore we can conveniently focus on the symmetric case.

Is this scheme secure?

That is an excellent question. For example, observe that if ${\epsilon}$ is equal to ${0}$, then this scheme is not secure. Indeed in this case, given sufficiently many encryptions of ${0}$, the secret key can be efficiently restored by simple Gaussian elimination. However, Gaussian elimination is a very “brittle” algorithm and if we inject even a relatively small amount of noise (e.g., set ${\epsilon = n^{-polylog(n)}}$ or ${\epsilon=2^{-\sqrt{n}}}$) then there is no known attack. In fact, the difficulty of solving noisy linear equations (in various settings and variants) underlies many cryptographic schemes as well as the famous learning parity with noise problem and its large alphabet cousin “learning with errors”. The scheme above is secure assuming a variant of the learning with errors problem, and so if you can break it, we (and a bunch of other people) will be happy to know!

We almost forgot that we are here to talk about homomorphism- lets try to see why our suggested scheme is homomorphic. Recall that for homomorphism, we are given two cyphertexts ${{\mathbf{c}}}$ and ${{\mathbf{c}}'}$ encrypting ${\sigma}$ and ${\sigma'}$ respectively and need to compute ciphertexts ${{\mathbf{c}}_{add}}$ and ${{\mathbf{c}}_{mult}}$ that encrypt ${\sigma + \sigma' \pmod{2}}$ and ${\sigma \sigma' \pmod{2}}$ respectively. The caveat is that of course we must perform these operations without access to the secret key.

Addition is easy. We can simply set ${{\mathbf{c}}_{add} = {\mathbf{c}} + {\mathbf{c}}' \pmod{2}}$. Now from Equation (1) we know that

$\displaystyle \begin{array}{rcl} {\mathbf{c}} \cdot {\mathbf{s}} &= \sigma + e + 2I \\ {\mathbf{c}}' \cdot {\mathbf{s}} &= \sigma' + e' + 2I' \end{array}$

where ${|e|,|e'| \leq \epsilon}$ and ${I,I'}$ are integers. So, if we add them together we’ll get that

$\displaystyle ({\mathbf{c}} + {\mathbf{c}}') \cdot {\mathbf{s}} = \sigma + \sigma' + (e+e') + 2(I+I') \;,$

or in other words,

$\displaystyle {\mathbf{c}}_{add} \cdot {\mathbf{s}} \pmod{2} =_{2\epsilon} \sigma+\sigma' \pmod{2} \;.$

That is, ${{\mathbf{c}}_{add}}$ is indeed an encryption of ${\sigma+\sigma' \pmod{2}}$ (with twice as much noise, but that’s not too bad).

Multiplication is more tricky. To get some intuition, let us start by ignoring that we work mod ${2}$ and have noise, and pretend that the given ciphertexts ${{\mathbf{c}}}$ and ${{\mathbf{c}}'}$ simply satisfy ${{\mathbf{c}} \cdot {\mathbf{s}} = \sigma}$ and ${{\mathbf{c}}' \cdot {\mathbf{s}} = \sigma'}$. Even then it’s not at all clear how we would find a ciphertext ${{\mathbf{c}}_{mult}}$ satisfying ${{\mathbf{c}}_{mult} \cdot {\mathbf{s}} = \sigma\sigma'}$. However, we do know one way to express ${\sigma\sigma'}$ in terms of ${{\mathbf{c}},{\mathbf{c}}'}$ and ${{\mathbf{s}}}$. Namely we have the equation

$\displaystyle \sum_{i,j} c_ic'_j s_is_j = ({\mathbf{c}} \cdot {\mathbf{s}})({\mathbf{c}}' \cdot {\mathbf{s}}) = \sigma\sigma' \;. \ \ \ \ \ (2)$

How can we use Equation (2) to create ${{\mathbf{c}}_{mult}}$? At this point we recall that even though the homomorphic operation cannot use the secret key, we are allowed to provide it with additional auxiliary information, so long as this information doesn’t violate security. (This is not cheating since the auxiliary information can either be a part of some public parameters/key, or simply concatenated to every encryption.) Specifically, suppose that we publish, a list of ${n^2}$ encryptions ${{\mathbf{x}}^{i,j}}$ of the bits ${\{ s_is_j \}_{i,j\in [n]}}$. Since these are encryptions, security is unharmed by making them public (more on this point later). However, if we keep to our blissful ignorance of noise and modulo factors, this means we can replace in Equation (2) the factors ${s_is_j}$ with ${{\mathbf{x}}^{i,j} \cdot {\mathbf{s}}}$, obtaining

$\displaystyle \sum_{i,j} c_ic'_j ({\mathbf{x}}^{i,j} \cdot {\mathbf{s}}) = (\sum_{i,j} c_ic'_j {\mathbf{x}}^{i,j})\cdot {\mathbf{s}} = \sigma\sigma' \;. \ \ \ \ \ (3)$

Thus, we can set ${{\mathbf{c}}_{mult} = \sum_{i,j} c_ic'_j {\mathbf{x}}^{i,j}}$ to obtain an encryption of ${\sigma\sigma'}$.

Now let’s see what happens when we consider the modulo and noise factors. Equation (2) changes to

$\displaystyle \sum_{i,j}c_ic'_j s_is_j = (\sigma + 2I+e)(\sigma' + 2I' + e') = \sigma\sigma' +2(I+I')+ O(\epsilon n) \;, \ \ \ \ \ (4)$

since (given that ${{\mathbf{c}},{\mathbf{c}} \in (-1,1]^n}$) we know that ${|I|,|I'| \leq n}$. This is not so bad, since we assume ${\epsilon}$ is really tiny, and in particular much smaller than ${1/n}$. However, now we need to consider these factors also in the encryptions ${{\mathbf{x}}^{i,j}}$. Namely, our guarantee will now be that ${{\mathbf{x}}^{i,j} \cdot {\mathbf{s}} = s_is_j + e^{i,j} + 2I^{i,j}}$ for some ${|e^{i,j}|\leq \epsilon}$ and integer ${I^{i,j}}$. Thus

$\displaystyle {\mathbf{c}}_{mult} \cdot {\mathbf{s}} = \sum_{i,j}c_ic'_j ({\mathbf{x}}^{i,j}\cdot {\mathbf{s}}) = \sum_{i,j} c_ic'_j s_is_j + \sum_{i,j} c_ic_j e^{i,j} + 2\sum_{i,j} c_ic'_j I^{i,j} =$

$\displaystyle = \sigma\sigma' + 2(I+I') + O(\epsilon n^2) + 2\sum_{i,j} c_ic'_j I^{i,j}$

Thus we would be done if ${\sum c_ic'_j I^{i,j}}$ was an integer. Unfortunately, we have no such guarantee, since ${{\mathbf{c}},{\mathbf{c}}'}$ are real vectors whose coordinates need not be integer. However, there is still something we can do. We consider the bit representation of the number ${c_ic'_j}$. That is, let us write

$\displaystyle c_ic'_j = \sum_{k\geq 1} c^{i,j,k}2^{-k}$

where the numbers ${c^{i,j,k}}$ are in ${\{0,1\}}$. We’ll stop at ${k=n}$, which is more than enough precision for our needs. Now we’ll publish encryptions ${{\mathbf{x}}^{i,j,k}}$ of the value ${s_is_j/2^k}$ (this is not a ${0/1}$ value, but our encryption is just as secure for encrypting arbitrary real numbers in ${(-1,1]}$ as for encrypting bits). Our ciphertext ${{\mathbf{c}}_{mult}}$ will be defined as

$\displaystyle {\mathbf{c}}_{mult} = \sum_{i,j,k} c^{i,j,k}{\mathbf{x}}^{i,j,k} \;.$

Clearly, ${{\mathbf{c}}_{mult}}$ can be computed from ${{\mathbf{c}},{\mathbf{c}}'}$ and the encryptions ${\{ {\mathbf{x}}^{i,j,k} \}}$. Now, writing ${{\mathbf{x}}^{i,j,k}\cdot {\mathbf{s}} = s_is_j/2^k + e^{i,j,k} + 2I^{i,j,k}}$, we can do a similar calculation as before to show that

$\displaystyle {\mathbf{c}}_{mult} \cdot {\mathbf{s}} = \sum_{i,j,k} (c^{i,j,k} s_is_j/2^k + e^{i,j,k} + 2I^{i,j,k}) = \sum_{i,j}c_ic'_j s_is_j + O(\epsilon n^3) +2(\sum_{i,j,k} I^{i,j,k}) =$

$\displaystyle \sigma\sigma' + O(\epsilon n^3) + 2I''$

for some integer ${I''}$. Thus ${{\mathbf{c}}_{mult}}$ is indeed an encryption of ${\sigma\sigma'}$, albeit with higher noise level of ${O(\epsilon n^3)}$.

Evaluating bounded depth circuits

So far we showed how to evaluate addition and multiplication, and get back to a ciphertext that looks the same as the one we started with. The only thing that should concern us is the growth of the noise in the process (recall that the noise magnitude has to stay under ${1/2}$ to guarantee correct decryption). We start with tiny noise of magnitude ${\epsilon}$, but with each homomorphic operation, the noise grows by a factor of at most ${O(n^3)}$ (a more careful look will show that the actual factor is actually ${O(n)}$, however ${O(n^3)}$ suffices for our purposes). We now recall that our final goal is to evaluate a generic function that is represented by an arithmetic circuit over ${GF(2)}$. If the circuit has depth ${d}$, then the evaluation process increases the noise by a factor of ${n^{O(d)}}$. Therefore, in order to correctly decrypt, we must choose ${\epsilon}$ so that ${\epsilon \cdot n^{O(d)} < 1/2}$. Recall that we can allow ${\epsilon}$ to be sub-exponential, say ${\epsilon = 2^{-\sqrt{n}}}$, and so we can correctly evaluate circuits of depth up to ${\Tilde{\Omega}(\sqrt{n})}$.

To summarize, we did not achieve fully homomorphic encryption yet, but we did show how to evaluate any arithmetic circuit of depth at most ${\sqrt{n}}$. This could be sufficient if one wants to evaluate fairly shallow circuits. In fact, if the depth ${d}$ of the circuit is known ahead of time, we can choose ${n}$ large enough to allow homomorphism of the desired circuit (this only requires ${n}$ polynomial in ${d}$). The latter notion is sometimes called leveled-FHE.

Bootstrapping and Full Homomorphism.

Finally, we now show how to achieve “real” fully homomorphic encryption using Gentry’s bootstrapping technique. Specifically, we will show a method to reduce the noise level of ciphertexts. Given a ciphertext ${{\mathbf{c}}}$ encrypting ${\sigma}$ with a noise level that can be as large as, say, ${1/10}$, this method yields another ciphertext ${{\mathbf{c}}'}$ encrypting the same bit ${\sigma}$ but with a noise level at most ${\epsilon \cdot n^{polylog(n)}}$. Again, this is done using only public information and without knowing ${{\mathbf{s}}}$. Given this tool, we can evaluate a circuit of any depth by periodically performing this bootstrapping operation whenever the noise becomes too large.

The (ingenious though somewhat confusing) idea behind bootstrapping is to homomorphically evaluate the decryption circuit itself. Given a ciphertext ${{\mathbf{c}}}$, we consider the function ${f_{{\mathbf{c}}}}$ that maps ${{\mathbf{s}}}$ into the decryption of ${{\mathbf{c}}}$ using the key ${{\mathbf{s}}}$. Since decryption in our scheme is basically just an inner product, this is a function of the secret key which can be computed by a circuit of depth ${d=polylog(n)}$. (In fact it is a good exercise to show this can be done in ${O(\log n)}$ depth.) This means that given an encryption of ${{\mathbf{s}}}$ with noise level ${\epsilon}$ (which we can make available publicly without harming security), we can run ${f_{{\mathbf{c}}}}$ homomorphically to obtain an encryption of ${\sigma}$, whose noise level will be ${\epsilon n^{O(d)} = \epsilon n^{polylog(n)}}$. This completes the noise reduction, and viola – our hard labor resulted in fully homomorphic encryption!

A note on circular security. There were a few details we glossed over, and we’ll be happy to address them in the comments below. One issue is that the standard definition of security for encryption does not guarantee that it’s OK to publish encryptions of (functions of) the secret key, as we needed to do, and so one needs to assume a stronger notion known as key dependent message security. However, based on current knowledge, the assumption that our scheme satisfies this stronger notion seems very plausible. In fact, in Section 4.4 of his thesis Gentry gives a heuristic argument that any fully homomorphic encryption in fact enjoys such strong security, by showing this holds in the so called random oracle model. (Cryptographers might find this result surprising given that the encryption scheme does not use any hash function.) Nevertheless, showing a fully (non-leveled) homomorphic encryption from standard assumptions remains an important open problem.