Will fully homomorphic encryption break the big data market?

What is fully homomorphic encryption?

Fully homomorphic encryption (FHE), often considered the holy grail of cyber-security by experts (Van Dijk and Juels, 2010)[1], is an emerging, though not new, concept that allows computations to be performed directly on encrypted data.

With regular encryption, modifying encrypted data means first decrypting it, applying the desired computations, and then re-encrypting the result.

FHE allows applications to perform the desired computations without ever needing to decrypt the original data, preserving end-to-end privacy.
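
To make the idea of computing on ciphertexts concrete, here is a minimal sketch using textbook RSA, which happens to be homomorphic for multiplication only. This is not FHE (a fully homomorphic scheme supports both addition and multiplication), and the tiny primes and helper names below are assumptions chosen for illustration, not a secure or realistic setup.

```python
# Toy textbook RSA with tiny primes: insecure, for illustration only.
# It is only *partially* homomorphic (multiplication), not fully homomorphic.
p, q = 61, 53
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (needs Python 3.8+)


def encrypt(m: int) -> int:
    """Encrypt an integer 0 <= m < n with the public key."""
    return pow(m, e, n)


def decrypt(c: int) -> int:
    """Decrypt a ciphertext with the private key."""
    return pow(c, d, n)


a, b = 7, 6

# Multiply the two ciphertexts without ever decrypting them.
c_product = (encrypt(a) * encrypt(b)) % n

# Only the private key holder can read the result, which equals a * b.
result = decrypt(c_product)
print(result)
```

The party multiplying the ciphertexts never sees `a`, `b`, or their product; it only handles the encrypted values.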

Fully homomorphic encryption diagram

So what is the big deal about it?

Because FHE does not require data to be decrypted before calculations are performed, privacy becomes much easier to preserve. Users could delegate computation to third parties such as cloud providers without the provider ever having access to the user's data, keeping the secrets secret for everyone except the data owner who holds the private key.
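
The delegation described above can be sketched with a toy version of the Paillier cryptosystem, which is homomorphic for addition only. Again, this is not FHE, the primes are far too small to be secure, and the `cloud_sum` function and the sample values are hypothetical names and data made up for this illustration.

```python
import math
import secrets

# Toy Paillier cryptosystem: additively homomorphic, NOT fully homomorphic.
# The primes are tiny and insecure; they are assumptions for illustration.


def keygen(p: int = 1789, q: int = 1861):
    """Return (public modulus n, private key (lam, mu))."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt 0 <= m < n under the public modulus n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random value in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, priv, c: int) -> int:
    """Decrypt a ciphertext with the private key."""
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


def cloud_sum(n: int, ciphertexts) -> int:
    """Hypothetical cloud provider: adds values while seeing only ciphertexts.

    Multiplying Paillier ciphertexts adds the underlying plaintexts.
    The function receives the public modulus but never the private key.
    """
    n2 = n * n
    total = 1  # 1 is a valid encryption of 0
    for c in ciphertexts:
        total = (total * c) % n2
    return total


# Data owner side: encrypt locally, send only ciphertexts to the cloud.
n, priv = keygen()
values = [1200, 3400, 560]
encrypted = [encrypt(n, v) for v in values]

encrypted_total = cloud_sum(n, encrypted)

# Only the owner, who holds the private key, can read the result.
total = decrypt(n, priv, encrypted_total)
print(total)
```

The sum must stay below the modulus `n` for the result to be correct, which is one reason real deployments use much larger keys.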

In today's world, security plays a big part in our everyday life, especially when it comes to online security. Take Google, for example: in mid-2010, a Google employee was fired for reportedly accessing the communications of at least four minors with Google accounts, spying on Google Voice call logs, chat transcripts and contact lists[2]. All of our messages, our photos, even our lives revolve around the internet in one form or another.

An implementation of FHE might have prevented this. Messages would be exchanged over some FHE-supported protocol in which the data is readable only by its owner, while communication could still proceed thanks to FHE's unique ability to compute on encrypted data.

Drawbacks of fully homomorphic encryption

The main problem with fully homomorphic encryption today is performance. Simply put, FHE is currently quite inefficient: some simple operations can take anywhere from seconds to hours[3]. It can be sped up at the cost of utility, or made more useful at the cost of performance. Security could also be traded for better performance and utility, but what's the point? This resembles the CAP theorem for databases, where any distributed data store can provide only two of the following three guarantees: consistency, availability, partition tolerance. In FHE's terms, the three would be performance, utility and protection.

FHE and the big data market

An internet user may not think twice about the data that they share on the internet. However, the tin-foil security people will know how big the big data market actually is. In the U.S. alone, the big data market was estimated at $50.1 billion in 2021, with predictions that it will reach $234.6 billion by 2026[4].

Big data is generated from a variety of sources, such as RFID readers, social networks, sensor networks, internet text and documents, call registers, internet search indexing, scientific research studies, medical records, military surveillance and eCommerce, among others. Big data comprises gathering, analyzing and using massive amounts of digital information to improve business operations, for example by giving businesses a 360-degree view of their audiences.

Big data life-cycle

Big data also comes with a risk. If data falls into the wrong hands, it can be used for targeted phishing, scams and spreading disinformation[5]. Every year the number of data breaches increases, which just shows how important it is to keep our data secure.

With the help of FHE, people would be able to ensure the safety of their data on sites such as social media, or even within the government. This raises the question: what will happen to the big data companies if FHE becomes standard practice and its performance issues are resolved?

How would the big data companies be able to harvest the data if the data is encrypted? Without access to the decrypted data, they would not be able to perform their normal processing: no phone number extraction, no medical records, no usable digital information. This would surely spell disaster for them, as their data would no longer be a 360-degree view of the audience, but rather some scrambled nonsense of bits and bytes. A benefit for the common internet user, and the destruction of the big data market.

On the other hand, there may be a push to focus elsewhere. Rather than the actual information, big data companies could shift their focus to the functions performed on the data. Analysing which functions are run on FHE-encrypted data would still allow some processing and analysis, but would move the focus towards what is actually being done with the data rather than what the data contains.

Because companies would no longer be dealing with what may have been users' private data, this could even spark a wider boom in the big data market. It would be easier for companies to stay compliant with the different laws and governance around data, such as GDPR, and data breaches would be far less impactful for regular customers.

But for now, we are still far away from this dream that fully homomorphic encryption promises us.