handslive | What does it take to make him post something?

You may have heard about the recent breach at Air Canada. Because Air Canada recently purchased Aeroplan, it turns out that I have credentials stored in their service. I don't use the app, so I'm not even likely to be part of the subset of users affected. And that's not why I'm posting anything here. Instead, it was a tweet from Ann Cavoukian stating that the breach wouldn't have happened if Air Canada had encrypted all of their customers' data at rest. And maybe I should tweet something about it? But my response is not a short aphorism.

Encryption of data in a case like this is about instituting an additional measure of access control, supplemental to existing controls. But applications are built of components that are in a sense layered (at least conceptually) on top of each other. So where we apply encryption determines which kinds of access are protected and how the overall system is affected. This is meant to be a quick tour, so I'm going to make broad, sweeping generalizations. This is based heavily on my own experience.

A web application will be made up of physical system components (CPUs, memory, storage, and network) that are carefully divided up into separate logical components that may then be further divided up into virtual components. A very simplistic application will have some sort of persistent storage of information, usually a database. The application will maintain an internal model of this information, which it uses to collect, manipulate, and display data in a user interface. There may be application or data services that separate the user interface further from the database, but this is the bare bones.

We can very easily check the box that says all the information Air Canada was storing was encrypted by encrypting the system storage used by the database. That might be the physical or virtual storage volume, or it might be the logical or virtual file system. The box is checked, but it has absolutely zero effect on the data breach they experienced. That's because the operating system (or some other underlying component) is encrypting and decrypting the data on behalf of the database, which doesn't even know there's any encryption. All it sees is the plain, unencrypted content. What we're using encryption for here is an additional level of separation from other organizations running in the same environment (like Amazon Web Services or Microsoft Azure). I have keys to my storage and you have keys to yours. If we somehow get access to each other's storage, the keys won't encrypt or decrypt the content.

Alternately, I could get the database to do this encryption directly using a method called Transparent Data Encryption. Here, the database knows the keys and can encrypt and decrypt directly, but the applications and users that access the database are unaware of it. They only see the unencrypted content. Typically, the applications and users need to be granted access in order for the key to be accessible to them, so this method is a good way to share a database with multiple users or applications while keeping the content separated. Done properly, it can also make sure that the system administrator for the operating system can't see the raw content on the file system and would need to be granted access to the database. If there's a need for segregation of duties, this helps achieve it. But it has zero effect on the breach because that wasn't an attack on the operating system. The attack came through the application, which transparently accesses the encrypted content.

Next, we could try to perform column level or attribute level encryption. This can be done in a number of different ways. I can configure the database to perform this operation, but it will likely require direct programming in the database through stored procedures or customized database queries to invoke the encryption and decryption operations. I can program the application to perform this operation. The database will only see encrypted content for each piece of data we encrypt, but an extra layer of processing is needed when storing and reading the information. Does this affect the breach? It depends.

From CBC's article (and other articles about the breach), it seems there was a flaw in the API used by the mobile app. They don't discuss the nature of the flaw, but let's work through some possibilities. For this I'll consider two kinds of application flaw (there are others, but they fall into similar camps for this case).

The first is SQL injection. SQL is the common language of relational database queries. An injection attack implies that the attacker has found a way to submit his own queries to the database through a poorly protected part of the application. This bypasses many controls that the application may otherwise try to impose. If we can only retrieve decrypted content by carefully invoking stored procedures (not ordinary SQL) or if the application itself was responsible for the encryption, then an ordinary SQL injection attack will be thwarted (and Ann Cavoukian is correct that Air Canada's data would be protected).
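
To make that concrete, here is a minimal sketch in Python using the standard library's sqlite3 module. Everything in it is my own invention for illustration: the profiles table, the passport values, the lookup_vulnerable function, and the way the key is held. I have no idea what Air Canada's schema or flaw actually looked like. The encrypt and decrypt helpers are a toy stand-in (an HMAC-SHA256 keystream with no integrity check) so the example runs without third-party packages; a real system would use a vetted encryption library.

```python
import hashlib
import hmac
import secrets
import sqlite3

# Hypothetical key that only the application holds; the database never sees it.
APP_KEY = secrets.token_bytes(32)


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 over nonce + counter. Illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt(key: bytes, plaintext: str) -> bytes:
    nonce = secrets.token_bytes(16)
    data = plaintext.encode("utf-8")
    stream = _keystream(key, nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def decrypt(key: bytes, blob: bytes) -> str:
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream)).decode("utf-8")


# The application encrypts the sensitive column before it reaches the database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE profiles (username TEXT, passport BLOB)")
db.execute("INSERT INTO profiles VALUES (?, ?)", ("handslive", encrypt(APP_KEY, "AB123456")))
db.execute("INSERT INTO profiles VALUES (?, ?)", ("someone_else", encrypt(APP_KEY, "ZZ999999")))


def lookup_vulnerable(username: str):
    # Deliberate bug: user input is concatenated straight into the SQL text.
    query = "SELECT username, passport FROM profiles WHERE username = '" + username + "'"
    return db.execute(query).fetchall()


# The injected condition is always true, so every row comes back...
leaked = lookup_vulnerable("x' OR '1'='1")
for username, passport in leaked:
    # ...but the attacker only gets ciphertext, because the key lives in the app.
    print(username, passport.hex())
```

The injection still succeeds in the sense that the attacker reads every row. What the attacker doesn't get is the key, so the passport column is just noise to them. That only holds as long as the key stays out of the database's reach.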

The second is broken access control. In this case, the application has a flaw in how it authorizes requests. This depends heavily on the design of the access control system. Maybe the application expects some information about the user or the record being accessed to appear in the request in an encoded way (maybe encrypted, but maybe not). It assumes this encoding is strong enough that it implicitly trusts any value it sees. So, if the attacker can figure out how to encode the request, then the application will grant access. In the dumbest forms of this weakness, there's no real encoding at all. Instead, an account number or a short session value is used and the attacker can add or subtract a little from their own request values to see other people's information. Maybe the application has a simplistic "gateway" of sorts that enforces access, but it's possible to go around the gateway and get directly at the raw API where no controls are enforced. The problem with broken access controls is that the application will perform the encryption and decryption steps faithfully because it's been fooled into believing the access is legitimate. So, this approach might not fix the breach problem, even though we're encrypting the data.
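
Here is a sketch of the "add one to the account number" version of this weakness, again in Python and again entirely hypothetical: the RECORDS store, the account numbers, and both get_passport functions are made up, and the toy encrypt/decrypt helpers (an HMAC-SHA256 keystream, no integrity check) stand in for a real encryption library. The second function shows one possible ownership check, not a complete access control design.

```python
import hashlib
import hmac
import secrets

# Hypothetical application-held key.
APP_KEY = secrets.token_bytes(32)


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 over nonce + counter. Illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt(key: bytes, plaintext: str) -> bytes:
    nonce = secrets.token_bytes(16)
    data = plaintext.encode("utf-8")
    stream = _keystream(key, nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def decrypt(key: bytes, blob: bytes) -> str:
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream)).decode("utf-8")


# Stand-in for the database: every passport number is encrypted at rest.
RECORDS = {
    1001: {"owner": "attacker", "passport": encrypt(APP_KEY, "AB123456")},
    1002: {"owner": "victim", "passport": encrypt(APP_KEY, "ZZ999999")},
}


def get_passport_broken(session_user: str, account_id: int) -> str:
    # Deliberate bug: the account number from the request is trusted as-is,
    # and session_user is never compared against the record's owner.
    return decrypt(APP_KEY, RECORDS[account_id]["passport"])


def get_passport_checked(session_user: str, account_id: int) -> str:
    record = RECORDS[account_id]
    if record["owner"] != session_user:
        raise PermissionError("account does not belong to this session")
    return decrypt(APP_KEY, record["passport"])


# The attacker is logged in as account 1001 and simply asks for 1001 + 1.
stolen = get_passport_broken("attacker", 1001 + 1)
print(stolen)  # the victim's passport number, decrypted by the app itself
```

Notice that the encryption worked perfectly in the broken version. The application decrypted the record exactly as designed; it just did so for the wrong person.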

A caveat to encrypting each individual attribute, though. One reason for storing data in a database is that we can index the data we've stored to allow for searches using only part of a value, like searching on "han*" to get "handslive". We can correlate records for analysis or specific business functions. This kind of thing only works on unencrypted data. It's hard to search for "han*" if we've actually stored "m1/234m,.zuxcovpu" instead of "handslive".
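
A small sketch of the search problem, using Python's sqlite3 module. The users table, its column names, and the extra sample usernames are hypothetical, and the encrypt helper is a toy stand-in (an HMAC-SHA256 keystream, no integrity check) for a real encryption library. I store the ciphertext as hex text here just to keep the comparison simple.

```python
import hashlib
import hmac
import secrets
import sqlite3

# Hypothetical application-held key.
APP_KEY = secrets.token_bytes(32)


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 over nonce + counter. Illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt(key: bytes, plaintext: str) -> bytes:
    nonce = secrets.token_bytes(16)
    data = plaintext.encode("utf-8")
    stream = _keystream(key, nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (plain_name TEXT, enc_name TEXT)")
for name in ("handslive", "hannah", "someone_else"):
    # Same value stored twice: once in the clear, once encrypted (as hex text).
    db.execute("INSERT INTO users VALUES (?, ?)", (name, encrypt(APP_KEY, name).hex()))

# Prefix search on the plaintext column works as expected.
plain_hits = [
    row[0]
    for row in db.execute(
        "SELECT plain_name FROM users WHERE plain_name LIKE 'han%' ORDER BY plain_name"
    )
]

# The same search on the encrypted column finds nothing: the stored value
# shares no prefix with the original text.
enc_hits = db.execute("SELECT enc_name FROM users WHERE enc_name LIKE 'han%'").fetchall()

print(plain_hits)
print(enc_hits)
```

With a random nonce per value, even an exact-match lookup fails, because encrypting "handslive" twice produces two different stored values.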

There are some approaches for solving this problem, but I have to say they aren't typically provided by the commercial products on the market unless you're specifically paying for them. Many solutions don't go much further than Transparent Data Encryption. Credit card numbers are an exception because of the standards imposed by the payment card industry. And many organizations accept, as a trade-off, that they won't be able to use card numbers directly for partial searches (they'll usually store the last 4 digits separately and unencrypted) or for additional analysis (fraud management requires a bit of extra work, too, for example).
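
The "last 4 digits" trade-off can be sketched like this in Python with sqlite3. The cards table, its column names, and the store_card and find_by_last4 functions are hypothetical, the card numbers are well-known test numbers rather than real ones, and the encrypt/decrypt helpers are a toy stand-in (an HMAC-SHA256 keystream, no integrity check) for a real encryption library.

```python
import hashlib
import hmac
import secrets
import sqlite3

# Hypothetical application-held key.
APP_KEY = secrets.token_bytes(32)


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 over nonce + counter. Illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt(key: bytes, plaintext: str) -> bytes:
    nonce = secrets.token_bytes(16)
    data = plaintext.encode("utf-8")
    stream = _keystream(key, nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def decrypt(key: bytes, blob: bytes) -> str:
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream)).decode("utf-8")


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cards (customer TEXT, pan_encrypted BLOB, pan_last4 TEXT)")


def store_card(customer: str, pan: str) -> None:
    # Full number is encrypted; only the last 4 digits are kept in the clear.
    db.execute(
        "INSERT INTO cards VALUES (?, ?, ?)",
        (customer, encrypt(APP_KEY, pan), pan[-4:]),
    )


def find_by_last4(last4: str) -> list:
    # The partial search runs against the unencrypted helper column only.
    rows = db.execute(
        "SELECT customer FROM cards WHERE pan_last4 = ? ORDER BY customer", (last4,)
    )
    return [row[0] for row in rows]


store_card("handslive", "4111111111111111")      # test number, not a real card
store_card("someone_else", "5500005555555559")   # test number, not a real card

print(find_by_last4("1111"))
```

The search is limited to exactly what was left unencrypted. Anything beyond the last 4 digits means decrypting each record in the application first, which is the extra work the trade-off buys.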

I do think that some of the content stored by Air Canada could definitely have been encrypted and protected to a level similar to credit card numbers. Passport numbers and Nexus card numbers, for example. But to encrypt everything? That runs the risk of seriously hampering things. My main point is that encryption isn't magical security dust that protects against everything.
