Facebook eavesdropping WhatsApp? The everlasting tension between security and privacy

26/08/2019

While this is a non-story, it does raise an interesting conversation about security, privacy and the tension between them.

This one’s going to be philosophical. Might be due to my birthday and old age. Feel free to skip or join me on this somewhat different journey of an article…

A month ago, an article on Forbes started a storm in a teacup. The article discussed a Facebook plan to thwart encryption in WhatsApp by adding client-side moderation of sorts:

“In Facebook’s vision, the actual end-to-end encryption client itself such as WhatsApp will include embedded content moderation and blacklist filtering algorithms. These algorithms will be continually updated from a central cloud service, but will run locally on the user’s device, scanning each cleartext message before it is sent and each encrypted message after it is decrypted.”

A few days later, Facebook disputed this, saying it is something they’d never do.

Problem solved.

As I was working at the time on security-related product requirements for one of my clients, this story stuck with me. It brought home the challenge and the difference between security and privacy: in most enterprise scenarios, you just can’t have them both.

I’d like to raise here a few of my thoughts on this subject, and also look at some of the differences between individuals, businesses and governments. This will also mix with the fact that I am a father of two children at ages 9 and 12 who both use WhatsApp on their smartphones regularly (that’s the norm here in Israel – if I could, I’d wait with smartphones for them a bit longer).


Protecting our privacy

There’s an expectation here that whatever we do online will stay private, similarly to how things work in the real world.

It sounds great, but what exactly is privacy anyway? And how do we translate it to our daily lives without technology?

Without technology, conversations were transient. They were never stored in any way, so people who talked to friends never really had a recording of that conversation, nor a transcript. And if we’re talking about technology, do we include the written word as part of a technology advancement or is that a pre-tech thing?

Today though, I can search and find a conversation with my daughter’s teacher from 5 years ago on WhatsApp. Is that a breach of the privacy of the teacher?

I don’t know the answers, and I am not advocating against privacy.

At the very least, I believe in encryption everywhere and in the concepts behind zero trust (i.e. not trusting machines in your own network). Is that privacy? Or security?

The challenge with privacy – the idea that the things we do in private stay private – is when you try and mix it with security.

Securing our society

I live in a country which seems to be at constant war with its neighboring countries and with the many who want to harm it and its citizens.

When going to a shopping mall, I am used to having my bags scanned and my privacy breached. Why? Because in the context of where I live – it saves lives. In order to maintain security, some privileges around privacy cannot be maintained.

The challenge here is when do we breach privacy too much in the name of security. How far is too far?

Taking all this online makes things even more challenging. Can governments rely on the ability to “spy” on their own citizens in the name of security? Are there things that are better spied on to make sure people don’t die? How far is too far?

Our society today values the lives of people so much. Is the life of a single person saved worth being able to spy on everyone?

Then there’s the bigger issue of corporations being multinational and social networks being global – who is securing society here? The corporations or the countries? Should corporations and social networks secure people against their governments or vice versa?

Securing our children

This one is where things get really tricky.

I’ve got kids. They’re small. I am in charge of them.

I’d like that whatever they do online will be private and ephemeral in its nature. Stupid stuff they do today shouldn’t haunt them as grownups. Good luck with that request…

On the other hand, how much privacy should they be allowed on social networks and on the Internet?

Should I be spying on them? Should I be able to filter content? Should I be alerted about questionable content that they get exposed to or are exposing themselves?

If anything, how does my kids’ right to the same privacy we value so much for ourselves weigh against my ability to educate them on what’s out there lurking in the shadows of the internet?

There are different apps to help parents with that. Most of them are quite invasive. I decided to go with something rather lightweight here but I can’t say it lets me sleep well at night. Nothing really does when you have kids.

Securing our business

If you are a business owner, you somehow need to know what your employees do on your behalf. This affects how customers look at and value your brand, so the privacy of your employees… well…

If a customer complains about a transaction, you’d like to go back and figure out the history of the interactions with that customer. If you’re in an industry that has strict rules and regulations, you might be forced by law to make a record of your employees’ interactions anyways.

How does that compare to the requirement for privacy? How does that fit with the march towards end-to-end encryption where the service provider itself (=you) can’t look at the interactions?

On one hand, you want and need encryption and security, on the other hand, this might not go hand in hand with securing the privacy of an individual employee. What works for consumers may not work in enterprise scenarios.

Our age of automation

Then there’s automation and machine learning and artificial intelligence.

As businesses, we want to automate as much of what we do in order to scale faster, better and at a lower price point.

As consumers, we want easier lives with fewer “steps” to make and remember. We’ve shifted from physical buttons on TVs to remote controls to voice control and content recommendations. At some point, these steps involve smarts and optimization that can only be obtained by looking at large collections of data across users.

In other words, we’re at a point in time where much of the next level of automation can only be introduced by collecting data, which in turn means breaching privacy.

Here are a few recent examples of all the great voice interfaces that are cropping up:

Who do we TRUST?

Here’s the funny bit – it doesn’t really seem like there’s anyone we can trust, while we need to trust everyone.

As an employee, I need to trust my employer. At least to some extent.

As a citizen, I need to trust my government. Especially in democracies, where I choose that government along with my fellow citizens. At least to some extent.

As a user of “apps”, I need to trust the apps I use. At least to some extent.

And yet, none of these organizations have shown that they should be trusted too much.

So in Blockchain we trust?

I beg to differ, at least today, with all the data and security breaches, along with other scandals around it. I can’t see this as a trusting environment.

Can we have both privacy and security?

Companies are looking for ways to bridge between the two alternatives.

It is interesting to see how Apple and Google each take a side. Apple leans towards privacy more than Google does, while Google is trying to use security and math to offer some extent of privacy while maintaining its machine learning advantage and ability to serve ads.

Then there are cloud-based end-to-end encryption solutions for enterprises where privacy is maintained by letting the enterprise hold the keys to its messaging kingdom and not letting the cloud provider have a peek. Cisco Webex for example does a good job here, going as far as giving granular control over where end-to-end encryption applies, at an individual or a group level.

Today though, we still don’t have a good solution that can offer both privacy and security and work well at all the levels we expect it to. I am not sure if we ever will have.

Why Facebook’s idea isn’t far-fetched

While Facebook said this isn’t even planned, the solution makes sense on many levels.

What are the challenges Facebook has with messaging?

  • The need to offer end-to-end encryption to its users. This is table stakes in social messaging these days
  • The need to play nice with governments around the globe. Each government has its own rules and nuances. Europe has GDPR and the right to be forgotten. The US is somewhat less restrictive in its privacy policies. China has all encryption keys to its kingdom. Different countries offer different privacy profiles to their citizens
  • Facebook has everyone looking over its shoulder, waiting for it to fail with users’ privacy. And it fails almost on a weekly basis
  • It has competitors to contend with. These competitors are working on bots and automation to improve the user experience. Facebook needs to do (and is doing) the same. Which requires access to user interaction data at some level

To do all this, Facebook needs to be able to access the messages, read them, decide what to do about them, but not do it on its own servers so they aren’t “exposed” to the actual content and data of the user. Which is exactly what this no-news item was all about. Let’s go back to the original quote from the Forbes piece:

“In Facebook’s vision, the actual end-to-end encryption client itself such as WhatsApp will include embedded content moderation and blacklist filtering algorithms. These algorithms will be continually updated from a central cloud service, but will run locally on the user’s device, scanning each cleartext message before it is sent and each encrypted message after it is decrypted.”

“will include embedded content moderation and blacklist filtering algorithms” – there will be a piece of code running on the mobile device of the user reading the messages. Here’s the news for you – there already is. That piece of code is the one sending the messages or receiving them. It is now going to be somewhat smarter though and look at the content itself, for various purposes – “content moderation” and “blacklist filtering”. → I definitely don’t want Facebook to do this for my content, but I really do want them to do it for my kids’ content and report back to me 🙂

“these algorithms will be continually updated from a central cloud service” – they already are. We call this software updates. Each release of an app gives us more features (or bug fixes). With machine learning, which these algorithms rely on, there’s a need to tweak and tune the model continually to keep it relevant. Makes perfect sense that this is needed.

“will run locally on the user’s device” – the content itself isn’t going to be stored in the cloud by Facebook. At least not in a way they can read and share directly. Which is what end-to-end encryption is all about.

This immediately reminded me of someone else who is doing that and offering it as an API. Google Smart Reply:

The Smart Reply model generates reply suggestions based on the full context of a conversation, and not just a single message, resulting in suggestions that are more helpful to your users. It detects the language of the conversation and only attempts to provide responses when the language is determined to be English. The model also compares the messages against a list of sensitive topics and won’t provide suggestions when it detects a sensitive topic.

It runs on device and reads all messages in cleartext. It applies algorithms, determines what to do next and offers the result as reply suggestions. All running on device, without interacting with the cloud.

To get to this level though, Google had to look at and read a lot of messages and build a model for the algorithms it uses.
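As a rough illustration of the flow described in that quote, here is a toy sketch in Python. It is not Google’s Smart Reply API or model: the `suggest_replies` function, the keyword list and the canned replies are all made up for this example, and simple rules stand in for the trained model. What it does show is the shape of the logic: look at the whole conversation on the device, stay silent on sensitive topics, and otherwise offer a few suggestions.

```python
from typing import List, Set

# Hypothetical sensitive-topic keywords; a real model would not use a fixed list.
SENSITIVE_KEYWORDS: Set[str] = {"funeral", "diagnosis", "lawsuit"}


def _words(text: str) -> Set[str]:
    """Lowercase words with surrounding punctuation stripped."""
    return {w.strip(".,!?").lower() for w in text.split()}


def suggest_replies(conversation: List[str]) -> List[str]:
    """Return reply suggestions computed entirely on the device."""
    if not conversation:
        return []
    # Use the full conversation as context, not just the last message.
    all_words: Set[str] = set().union(*(_words(m) for m in conversation))
    if all_words & SENSITIVE_KEYWORDS:
        return []  # no suggestions when a sensitive topic is detected
    last = conversation[-1].strip()
    if last.endswith("?"):
        return ["Yes", "No", "Let me check"]
    if _words(last) & {"thanks", "thank"}:
        return ["You're welcome", "Any time"]
    return ["OK", "Sounds good"]
```

Nothing here touches the network, which is the privacy-friendly part. The part this sketch skips is the hard one: a real model has to be trained on a large body of conversations before it can run on the device at all.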

Figuring out security and/or privacy in modern applications and services isn’t easy. It comes with a lot of tradeoffs that need to be made throughout the whole process – from requirements to deployment.


Responses

Lauro Moraes says:
August 26, 2019

My thoughts here:

I believe that security in the digital context is the key to privacy in the digital context, but these are two conflicting expectations in a digital business model.

Big digital companies are only big due to data collection, they can even sell products, but their main product is their users who actually “give the raw material” (their data).

With respect to public (government or tied) and private companies, transparency and auditing of their employees’ activities in the context of their activities is at least expected. Whether “outside” monitoring is required or practiced when an employee is no longer on duty is another matter, perhaps legal.

I don’t think it’s absurd to use client-side algorithms that filter out potential spammers or misuse … I particularly hate receiving spam and abusive messages that most often come from senders I don’t even know about. But I have caveats about an automatic filter on the sender side … such a filter is only possible with ML which means that at some point some of the so-called “private” user data will be sent somewhere. Who guarantees that one person won’t read “that most private message” you send to another? And who guarantees that the user will be aware of this?

Blockchain, what does it refer to? The implementation of crypto-currency, Ethereum and contracts or the concept? I think the “concept” is valid but there is no implementation today that fits the topic of this post. Would give a dissertation 🤔

We do not always see what we think is best for us with the eyes of others. Everything that comes to us through digital media influences us in our lives, you are right to worry about what comes to your children. They may not care about your thoughts today, but they will in a few years.

Regulations, well they exist. What remains is to adapt.

Congratulations, peace and health … ps: I love your publications!

    Tsahi Levent-Levi says:
    August 26, 2019

    Lauro,

    Thanks for the thoughtful comment and the kind words.

    With Blockchain, my main gripe is the need people have for sticking it everywhere, just to be able to say crypto or blockchain. From my reading and the little I know about it, the reason to use it is when you have trust issues, but there, it requires a marketplace of sorts with people willing to use it. The thing is that in almost all cases today, trust already exists between the user and the service provider itself (to some extent), and using blockchain isn’t going to increase that trust or solve a problem – simply because services and service providers work in their own islands in the cloud and not in some federated manner like carriers do (where trust between carriers is mostly enforced through standards and legal contracts).
