As applications move to cloud computing platforms, ensuring data confidentiality and integrity becomes a serious concern. Users want to ensure that even if untrustworthy systems handle their confidential data, their data cannot be disclosed. In addition, users want to ensure that computations done by the (untrusted) systems are correct.
The latter problem is known as the problem of computation delegation, which is dear to my heart. I started talking about this problem in my previous post and promised a followup post on the subject, which is yet to come. In this post, however, I want to focus on the problem of confidentiality.
Unfortunately, breaches of confidential data are not uncommon: personal information of millions of people, such as financial, medical, customer, and employee data, is disclosed every year [Ver12]. These disclosures often happen because untrustworthy systems handle confidential data.
A powerful technique for preventing data disclosures is fully homomorphic encryption (FHE). In a breakthrough work, Gentry [Gen09] constructed the first FHE scheme (which shortly after led to several followup work). In my view, this result is one of the most important results in modern cryptography (and maybe even one of the most important results in theoretical computer science). Fully homomorphic encryption (FHE) allows the remarkable ability of computing on encrypted data without access to the data itself. Namely, given a ciphertext Enc(x) and given a function f, one can efficiently compute Enc(f(x)) without learning anything about x.
Hence, users can now use FHE, and send to the untrusted systems an encryption of their data, rather than the data “in the clear”, allowing the systems to perform computations on this data, without disclosing any confidential information.
Due to the importance and usefulness of FHE, there are several researchers who are now trying to implement FHE, and make it applicable in practice. However, it turns out that in many cases, the bottleneck of the applicability of FHE is not in its actual runtime. Rather, its lack of practicality stems from the fact that it makes the system handling the data completely “blind”.
Consider the example of spam filters: Suppose we would like to hide our email content from Gmail, since we are concerned with data disclosure. We can of course encrypt all our emails using FHE. However, a problem with this solution is that it prevents Gmail from using spam filters. Gmail can of course run the spam detection algorithm homomorphically on an encrypted email and obtain an encrypted result; but it cannot tell if the algorithm deems the email spam or not.
Ideally, we would like to allow Google to learn if an email is spam or not, but learn nothing else about the email content. This can be achieved using what’s called functional encryption, a notion that originated from the work of Sahai and Waters [SW05], and later formalized by O’Neill [ON10], and by Boneh, Sahai and Waters [BSW11].
In functional encryption, anyone can encrypt a message Enc(x) using the public key, and the holder of the (master) secret key can provide keys for functions, for example, sk_f for a function f. Anyone with access to a key sk_f and a ciphertext Enc(x) can compute f(x) “in the clear”. The security requires that an adversary does not learn anything about x, other than the computation result f(x).
It is easy to see, for example, how to solve the above spam filter problem with a functional encryption scheme. A user Alice publishes her public key online and gives the Gmail a key for the filtering function. Users sending email to Alice will encrypt the email with her public key. Gmail can now determine by itself, for each email, whether to pass it along to Alice’s mailbox or to deem it as spam, without ever learning anything about Alice’s email (other than the fact that it was deemed as spam or not).
So, the next question is: Do we have functional encryption schemes? And the answer is yes and no. We have functional encryption schemes that allow the release of only one single function key [SS10,GVW12,GKPVZ13]. This seems to be enough for the above application of spam filters, since the only key that needs to be released is for the spam detection function. However, there are many other important applications where a single function key is not enough. For example, in database applications, often blinding the database completely is not practical since it requires the database to go over the entire data every time it wants to do some computation (such as retrieve an element). Therefore, the user may want to give the database some information that, on the one hand, will not compromise security, and on the other hand, will allow the database to search the data efficiently, thus avoiding this “worst-case curse”. For this application, we need a functional encryption scheme that allows the release of many function keys. For example the data owner may want to release many keys corresponding to a comparison function f_y, which on input x checks whether y>x.
Unfortunately, we have multi-key functional encryption schemes only for the class of inner product functions (functions f_y that on input x output 1 if the inner product of x and y is 0, and output 0 otherwise) [BW07,KSW08,SSW09,AFV11]. Moreover, there is a negative result by [AGVW12] that shows that there does not exist a multi-key functional encryption scheme for functions that are “not learnable”. However, this negative result is only with respect to a simulation-based definition. Therefore, a very interesting open problem is to construct a multi-key functional encryption scheme for a large class of functions (such as all poly-size circuits), under a different security definition such as an indistinguishability-based definition.
I am currently obsessed with this problem and if anyone has ideas, I would love to hear!
[AFV11] Shweta Agrawal, David Mandell Freeman and Vinod Vaikuntanathan: Functional Encryption for Inner Product Predicates from Learning with Errors. ASIACRYPT 2011, pages 21-40.
[AGVW12] Shweta Agrawal, Sergey Gorbunov, Vinod Vaikuntanathan and Hoeteck Wee: Functional Encryption: New Perspectives and Lower Bounds. Cryptology ePrint Archive, Report 2012/468, 2012.
[BSW11] Dan Boneh, Amit Sahai and Brent Waters: Functional Encryption: Definitions and Challenges. TCC 2011, pages 253-273.
[BW07] Dan Boneh and Brent Waters: Conjunctive, subset, and range queries on encrypted data. TCC 2007, pages 535-554.
[Gen09] Craig Gentry: Fully homomorphic encryption using ideal lattices. STOC 2009, pages 169-178.
[GKPVZ13] Shafi Goldwasser, Yael Kalai, Raluca Ada Popa, Vinod Vaikuntanathan and Nickolai Zeldovich: Succinct Functional Encryption and Applications: Reusable Garbled Circuits and Beyond. STOC 2013.
[GVW12] Sergey Gorbunov, Vinod Vaikuntanathan and Hoeteck Wee: Functional Encryption with Bounded Collusions via Multi-party Computation. CRYPTO 2012, pages 162-179.
[KSW08] Jonathan Katz, Amit Sahai and Brent Waters: Predicate Encryption Supporting Disjunctions, Polynomial Equations, and Inner Products. EUROCRYPT 2008, pages 146-162.
[ON10] Adam O’Neill: Definitional Issues in Functional Encryption. Cryptology ePrint Archive, Report 2010/556, 2010.
[SS10] Amit Sahai and Hakan Seyalioglu: Worry-free encryption: functional encryption with public keys. ACM CCS, 2010, pages 463-472.
[SSW09] Emily Shen, Elaine Shi and Brent Waters: Predicate Privacy in Encryption Systems. TCC 2009, pages 457-473.
[SW05] Amit Sahai and Brent Waters: Fuzzy Identity-Based Encryption. EUROCRYPT 2005, pages 457-473.
[Ver12] Verizon RISK Team: Data Breach Investigations Report 2012.http://www.verizonbusiness.com/resources/reports/rp_data-breach-investigations-report-2012_en_xg.pdf.
10 thoughts on “Challenges in Outsourcing Computation”
I do not fully understand the restrictions on the function in the single key functional encryption scheme case. Are these functions described via some universal notion, such as Turing Machines or circuits? Why can’t we pass a single key for a function that is “padded” in a way that allows it to compute “spam filtering” on all but the most significant bit when the MSB is 0 and “content based advertising” on all n-1 LSBs when the MSB is 1?
Addendum: or the function could “output” multiple concatenated strings, one for spam, one for advertising, etc.
Thanks for your question. The functions are usually modeled as circuits. It is true that you can “encode” many functions in one (as you suggested), but you are allowed to release only one function key. So, you cannot first release a key corresponding to some f_1, and then (adaptively) release another key corresponding to f_2, and so on. Also, in all known constructions the efficiency degrades with the number of output bits of the function. Namely, one needs to set a parameter t corresponding to the number of output bits of the function, and the efficiency of the scheme depends (polynomially) on t.
It feels like there is still some flexibility because the functions are so universal.
Why not use a “bootstrapping” technique similar to Gentry?
Let the circuit being sealed by the key k0, called A, represent a function that decrypts and evaluates two (e.g. spam and ad) circuits, B and C, with provided input keys (the first 2n bits of input to the circuit A).
Certainly there would be blowup from this, and also the non-uniformity of the model might get technical at times.
Hi Yael, There is some similarity between the task of allowing computations on your data without revealing any information (cryptography) that this post is about, and the task of giving a lot of information while keeping some crucial parts private (privacy), that was mentioned in some earlier posts. It looks that both areas are relevant for cloud computation. Are there some connections/ middle grounds/ etc. between cryptography and privacy?
This is a great question. Indeed in both, we want to keep some information confidential while allowing for some computation. Surprisingly, it seems that the techniques used in cryptography and privacy are often very different (though I am by no means an expert on privacy). In particular, most of the works in privacy are information theoretic, i.e., do not rely on any computational assumption, whereas most works in cryptography (and all works mentioned in my post) are computational.
I would like to have a good answer to your question, and hopefully some of our privacy experts will be able to help out…
Great question which deserves its own post but just to give an initial answer here goes: the privacy of Cryptography and the privacy of Differential Privacy are complementing. The former tells us how to evaluate a function when some of the inputs are secret in a way that will minimize leakage of information. Differential Privacy asks which functions could be evaluated without “too much information” being leaked and how to perturb the function such that utility is mostly preserved but more privacy is obtained (this is very rough and informal). This also explains why the latter is “more information theoretic than cryptography” as observed by Yael (though looking more deeply, differential privacy has a large computational component to it).
As an example, consider an ascending auction. Participants in the auction hear a sequence of bids which give a lot of information about the inputs of the bidders (their valuation of the product and their bidding strategies). Instead, consider a sealed-bid second-price auction where all bids are confidentially given to the auctioneer and the auctioneer only announces the winner and the second price (which is what the winner should pay). If we trust the auctioneer then much less information about the inputs is revealed. What cryptography allows is to do the same without a trusted party. This is what secure evaluation implies (and much of the further study, as the one in the original post is about various efficiency considerations including avoiding interaction which for some applications is simply impossible). But even given a trusted party, or alternatively using cryptography, there is still a privacy consideration: while the inputs themselves are kept secret, some information is leaked through the output. We learn who won the bid, a lower bound on the winner’s bid and the exact bid of the second one (which given auxiliary information we may have a good guess on its identity).
So while cryptography protects direct leakage of the inputs, cryptography cannot protect what is implied about the inputs from the output. The kind of privacy studied by differential privacy is exactly that.
Great answer, thanks Omer!
This is definitely one of the most interesting topics about outsourcing and computation. It’s great to learn from you guys! I liked Omer’s last answer and really learned a lot. These things weren’t clear to me before I entered the world of outsourcing (or maybe because I mastered something else), but I’ve learned so much now. I think I would be reading more similar posts like this in the future. Thanks, everyone!