Case: Robust Data Storage

How we built a breach-resilient, encrypted document storage system.

Technologies used

  • HashiCorp Vault
  • S3 object storage
  • C#/.NET
  • MS SQL

Background

The client needed a more robust and feature-rich storage system for managing documents and other files. Key requirements included file versioning, tenant data separation, auditability and tamper-resistant storage. The files had to be encrypted, and the system architecture had to be resilient against potential data breaches.

Challenges

We encountered the following challenges during the project:

  • The zero secret problem: Secure systems rely on encryption at rest and in transit, which in turn requires that a server can safely store the keys or other secrets it needs. We can simplify this process by deriving all secrets from a "master secret". The challenge then becomes: how can we ensure that this initial master secret has not been intercepted or tampered with, and that the server is its sole recipient?
  • Key rotations and WORM: WORM ("write once, read many") guarantees that data is never changed after it has been written, and is the mechanism used to ensure that an object has not been tampered with. This poses an interesting problem: how can a system perform key rotation without changing the underlying encrypted object?
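
The "derive all secrets from a master secret" idea in the first challenge can be sketched as follows. This is a minimal illustration only: the case study does not say which derivation function was used, so HKDF with SHA-256 (RFC 5869) is an assumption, and `derive_secret` and the purpose labels are hypothetical names. It covers only the derivation step, not the harder question of delivering the master secret safely.

```python
import hashlib
import hmac

HASH_LEN = 32  # output size of SHA-256 in bytes


def hkdf_extract(salt: bytes, master_secret: bytes) -> bytes:
    """HKDF extract step: condense the master secret into a pseudorandom key."""
    return hmac.new(salt, master_secret, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF expand step: stretch the pseudorandom key into `length` bytes bound to `info`."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested length too large for HKDF-SHA256")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_secret(master_secret: bytes, purpose: str, length: int = 32) -> bytes:
    """Derive an independent secret per purpose (hypothetical helper).

    An all-zero salt is what RFC 5869 specifies when no salt is supplied.
    """
    prk = hkdf_extract(b"\x00" * HASH_LEN, master_secret)
    return hkdf_expand(prk, purpose.encode(), length)


# Assumed purpose labels, for illustration only.
master = b"example-master-secret"
db_secret = derive_secret(master, "db-credentials")
s3_secret = derive_secret(master, "s3-credentials")
```

Each purpose yields a different secret, yet the server only ever needs to receive one value, which is why securing that single delivery becomes the core problem.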

Solution

Prior to development, we scoped the project and designed an architecture that satisfied the requested features and aligned with the threat model.

The threat model considered the following threats:

  • External threats: Man-in-the-middle or supply chain attacks, and unauthorised access to servers or other infrastructure.
  • Internal threats: Employees or developers accidentally or maliciously gaining access to unauthorised documents and data.
  • Other threats: Implementation bugs, misconfigurations, or software vulnerabilities that could lead to unintended data leakage or cross-tenant access.

Data is stored as three separate fragments in such a way that no meaningful information is leaked if one, or even two, of the three services are compromised simultaneously. The key vault handles encryption and decryption, the database stores the encrypted DEKs (data encryption keys, the keys that encrypt the actual data) and other metadata, and the S3 server stores the encrypted data.

Two important encryption techniques are used in this system: envelope encryption and authenticated encryption.

To perform a key rotation without changing the existing encrypted object, we used a technique called "envelope encryption". Instead of directly encrypting the data with a fixed key, we generate a DEK, encrypt the object with this key, and then encrypt the DEK itself with a separate key encryption key (KEK). To perform a rotation, we now only have to re-encrypt the DEKs instead of the encrypted data.
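
The rotation property can be illustrated with a short sketch. It is written in Python with the `cryptography` package for brevity and is not the project's implementation: in the real system the wrapping key lives inside the key vault, whereas here a local AES-GCM key stands in for it, and all function names are hypothetical.

```python
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

NONCE_LEN = 12  # standard nonce size for AES-GCM


def wrap(wrapping_key: bytes, dek: bytes) -> bytes:
    """Encrypt a DEK under the wrapping key; the nonce is prepended to the result."""
    nonce = os.urandom(NONCE_LEN)
    return nonce + AESGCM(wrapping_key).encrypt(nonce, dek, None)


def unwrap(wrapping_key: bytes, wrapped_dek: bytes) -> bytes:
    """Recover the DEK from its wrapped form."""
    nonce, ciphertext = wrapped_dek[:NONCE_LEN], wrapped_dek[NONCE_LEN:]
    return AESGCM(wrapping_key).decrypt(nonce, ciphertext, None)


def encrypt_object(wrapping_key: bytes, plaintext: bytes) -> tuple[bytes, bytes]:
    """Encrypt data with a fresh DEK and return (encrypted object, wrapped DEK)."""
    dek = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(NONCE_LEN)
    blob = nonce + AESGCM(dek).encrypt(nonce, plaintext, None)
    return blob, wrap(wrapping_key, dek)


def decrypt_object(wrapping_key: bytes, blob: bytes, wrapped_dek: bytes) -> bytes:
    """Unwrap the DEK, then decrypt the stored object with it."""
    dek = unwrap(wrapping_key, wrapped_dek)
    return AESGCM(dek).decrypt(blob[:NONCE_LEN], blob[NONCE_LEN:], None)


def rotate(old_key: bytes, new_key: bytes, wrapped_dek: bytes) -> bytes:
    """Key rotation: only the small wrapped DEK is re-encrypted."""
    return wrap(new_key, unwrap(old_key, wrapped_dek))


old_key = AESGCM.generate_key(bit_length=256)
new_key = AESGCM.generate_key(bit_length=256)
document = b"example document contents"

blob, wrapped = encrypt_object(old_key, document)  # blob would go to WORM storage
rewrapped = rotate(old_key, new_key, wrapped)      # blob is not touched
recovered = decrypt_object(new_key, blob, rewrapped)
```

After `rotate`, the object in storage is byte-for-byte the same, which is what makes rotation compatible with write-once storage. Only the wrapped DEK kept alongside the metadata is replaced.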

In most encryption schemes, the key alone dictates whether you have access to some data. This can lead to problems where cross-tenant access is possible in theory. We therefore use authenticated encryption to bind every object to a context, e.g. the current user, so that an object cannot be accidentally decrypted by a different user.
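
One common way to bind ciphertext to a context is the "associated data" input of an authenticated cipher such as AES-GCM. The sketch below assumes that approach. The case study does not state the cipher or the context format, so `context_for` and its `user=...;object=...` layout are hypothetical, and Python with the `cryptography` package is used purely for illustration.

```python
import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

NONCE_LEN = 12  # standard nonce size for AES-GCM


def context_for(user_id: str, object_id: str) -> bytes:
    """Build the context bytes that are authenticated with the data (assumed format)."""
    return f"user={user_id};object={object_id}".encode()


def seal(key: bytes, plaintext: bytes, context: bytes) -> bytes:
    """Encrypt the plaintext and authenticate the context alongside it."""
    nonce = os.urandom(NONCE_LEN)
    return nonce + AESGCM(key).encrypt(nonce, plaintext, context)


def open_sealed(key: bytes, blob: bytes, context: bytes) -> bytes:
    """Decrypt; raises InvalidTag if the key, the data, or the context does not match."""
    return AESGCM(key).decrypt(blob[:NONCE_LEN], blob[NONCE_LEN:], context)


def can_open(key: bytes, blob: bytes, context: bytes) -> bool:
    """Convenience wrapper that reports whether decryption succeeds."""
    try:
        open_sealed(key, blob, context)
        return True
    except InvalidTag:
        return False


key = AESGCM.generate_key(bit_length=256)
sealed = seal(key, b"payslip", context_for("alice", "doc-1"))

alice_ok = can_open(key, sealed, context_for("alice", "doc-1"))
bob_ok = can_open(key, sealed, context_for("bob", "doc-1"))
```

Even with the correct key, decryption under a different user's context fails the authentication check, so a bug that hands the wrong key or object to a request does not silently return another user's data.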

Mitigating data breaches

Database: The database stores the encrypted DEKs and the various metadata fields required for domain logic and table joins. If this service is compromised, the encryption keys can simply be rotated and the old key can be destroyed.
Key Vault: We chose HashiCorp Vault to manage key encryption keys (KEKs) and data encryption keys. The vault is only responsible for unwrapping DEKs and generating new keys. If this service is compromised, the system merely has to rewrap all encrypted DEKs (EDEKs) to ensure that no data could have been leaked.
S3 storage: S3, a cloud storage service originally created by Amazon whose API later became a standard interface for object storage, provides versioning, WORM, and retention policies out of the box. Should the S3 store be compromised, we simply rewrap all objects with new data encryption keys. This effectively performs "crypto-shredding", rendering the stored data unrecoverable.

Conclusion

The final system met the client's goals by delivering a secure, tamper-resistant, and scalable document storage platform. Through the use of envelope encryption, authenticated encryption, and strict separation between services, the system ensures that data remains protected even in the event of a partial compromise. As a result, the client is better equipped to handle sensitive data and maintain a high degree of security.