The aim of this document is to help other parties make decisions regarding scalable, privacy-aware and privacy-respecting software architectures. To achieve this, we first explain the general expectations for the SPECIAL platform, then the objectives that were extracted from those expectations, and, finally, how those objectives lead to the SPECIAL-K architecture. (For more information, please consult Deliverable 3.4.)
The goal of the SPECIAL platform is to give data subjects control over their own data. This means that they should be able to visualize the consent they have given and to change that consent at any moment in time. They should also be able to obtain a historical overview of the processing of their data in light of the consent given at the time.
Additionally, data processors need to be able to log their activities for review. Ideally, logging should interfere as little as possible with their existing work process, and the volume of logged activity should be able to grow over time. Lastly, data processors would also benefit from a historical overview of consent alongside their processing logs.
These expectations translate into the following objectives:
- Allow users to manage their own consent
- Control personal data processing
- Low impact access
- Relatively large volumes of consent
- Fast compliance checks
- Historic logging of consent management
- Historic logging of processing activities
The architecture and its rationale
Since we want to provide a scalable architecture, we first looked at the points that might prove to be bottlenecks when scaling.
We identified the applications that log processing activities and the checking of consent as the most likely bottlenecks. The only other input comes from consent management by the users, and we assume that users will reevaluate their consent far less often than their data is processed.
The other expensive operation would be to calculate a historical view of the data processing activities.
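To make the cost of such a historical view concrete, here is a minimal sketch of evaluating each processing event against the consent that was in force at that moment. The `ConsentHistory` class, the integer timestamps, and the purpose names are illustrative assumptions, not part of the SPECIAL-K implementation.

```python
import bisect

class ConsentHistory:
    """Hypothetical per-user consent timeline (names are illustrative)."""

    def __init__(self):
        self._times = []   # timestamps of consent changes, ascending
        self._states = []  # set of consented purposes after each change

    def record(self, timestamp, purposes):
        # Assumption: changes are recorded in chronological order.
        self._times.append(timestamp)
        self._states.append(frozenset(purposes))

    def consent_at(self, timestamp):
        # Find the last consent change at or before the given timestamp.
        i = bisect.bisect_right(self._times, timestamp)
        return self._states[i - 1] if i else frozenset()

def historical_view(history, processing_events):
    """Pair every (timestamp, purpose) event with its compliance at that time."""
    return [(ts, purpose, purpose in history.consent_at(ts))
            for ts, purpose in processing_events]

history = ConsentHistory()
history.record(10, {"marketing"})  # consent given at t=10
history.record(20, set())          # consent withdrawn at t=20

view = historical_view(history, [(5, "marketing"), (15, "marketing"), (25, "marketing")])
```

Each event needs a lookup in the consent timeline, so the cost of the view grows with the number of processing events, which is why this operation is worth treating separately from the live compliance checks.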
Since we want the applications that process data to be affected as little as possible by the need to register their activity, we decided to decouple that registration from the actual checking of consent for that processing. This is done by placing a message broker between the two. We chose Apache Kafka, which is known to handle large volumes of data and to scale well. Given the correct configuration, Kafka also provides us with a high-performance audit trail.
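The decoupling described above can be illustrated with a small sketch. It uses an in-memory `queue.Queue` as a stand-in for a Kafka topic so that it runs without a broker; the function names and the event fields (`user`, `purpose`, `data`) are assumptions made for illustration only.

```python
import json
import queue
import threading

# In-memory stand-in for a Kafka topic (assumption: a real deployment
# would publish to and consume from Kafka instead).
processing_log = queue.Queue()

def log_processing_event(user_id, purpose, data_category):
    """Called by the application: enqueue the event and return immediately."""
    event = {"user": user_id, "purpose": purpose, "data": data_category}
    processing_log.put(json.dumps(event))

def compliance_worker(checked_events):
    """Runs independently of the application and drains the log."""
    while True:
        raw = processing_log.get()
        if raw is None:  # sentinel used here to stop the worker
            break
        checked_events.append(json.loads(raw))

checked_events = []
worker = threading.Thread(target=compliance_worker, args=(checked_events,))
worker.start()

# The application only pays the cost of an enqueue, not of a consent check.
log_processing_event("user-1", "marketing", "email")
log_processing_event("user-2", "billing", "address")

processing_log.put(None)
worker.join()
```

The application never waits for the compliance check, and the queued events remain available in order, which mirrors the role the broker plays as a buffer between the two sides.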
Next, we want to be able to handle large volumes of consent checks. To get there, we observed that one user's consent is generally independent of any other user's consent: if user 1 consents to something, this should in no way affect user 2. Since the number of users is high and the number of applications is expected to be several orders of magnitude smaller, we copy the full application data to all consent checkers and keep it up to date. The user data can then be sharded over multiple consent checker instances. Because we chose Kafka as the message broker, we use Kafka consumer groups to distribute the messages among the consent checkers.
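The following sketch shows the idea of replicating application data while sharding user data. In Kafka, messages with the same key land in the same partition and each partition is read by one consumer of a group; the sketch imitates that with a CRC32 hash of the user ID, which is an assumption and not Kafka's own partitioner. `ConsentChecker`, `NUM_CHECKERS`, and the sample policy are hypothetical.

```python
import zlib

NUM_CHECKERS = 3  # hypothetical number of consent checker instances

# Application data is small, so every checker holds a full copy of it.
APPLICATION_POLICIES = {"newsletter-app": {"purpose": "marketing"}}

def shard_for(user_id, num_shards=NUM_CHECKERS):
    """Stable mapping from user ID to shard (stand-in for key-based partitioning)."""
    return zlib.crc32(user_id.encode("utf-8")) % num_shards

class ConsentChecker:
    def __init__(self, shard):
        self.shard = shard
        self.applications = dict(APPLICATION_POLICIES)  # replicated
        self.consents = {}                              # sharded: only this shard's users

    def record_consent(self, user_id, purpose):
        self.consents.setdefault(user_id, set()).add(purpose)

    def is_allowed(self, user_id, application):
        purpose = self.applications[application]["purpose"]
        return purpose in self.consents.get(user_id, set())

checkers = [ConsentChecker(i) for i in range(NUM_CHECKERS)]

def route(user_id):
    """Every message about a given user reaches the same checker."""
    return checkers[shard_for(user_id)]

route("alice").record_consent("alice", "marketing")
alice_allowed = route("alice").is_allowed("alice", "newsletter-app")
bob_allowed = route("bob").is_allowed("bob", "newsletter-app")
```

Because a check for one user never needs another user's consent, each checker can answer from its own shard plus the replicated application data, and adding checkers increases capacity without coordination between them.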
Finally, a prerequisite for processing is an identity-aware personal data inventory, which needs to keep an up-to-date index of all personal data, its location, and the corresponding identities. In the SPECIAL-K architecture, any request for personal data would go through a personal data gateway, which would in turn consult the inventory for relevant metadata, including ownership, making policy compliance checking per data subject possible.
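As a rough illustration of how a gateway could use the inventory, consider the sketch below. All class names, fields, and the `is_compliant` callback are hypothetical; the source does not specify the inventory's schema or the gateway's interface.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InventoryEntry:
    """Hypothetical metadata kept per piece of personal data."""
    data_id: str
    location: str
    data_subject: str
    category: str

class PersonalDataInventory:
    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.data_id] = entry

    def lookup(self, data_id):
        return self._entries.get(data_id)

class PersonalDataGateway:
    def __init__(self, inventory, is_compliant):
        self.inventory = inventory
        self.is_compliant = is_compliant  # callable(data_subject, category, purpose) -> bool

    def request(self, data_id, purpose):
        entry = self.inventory.lookup(data_id)
        if entry is None:
            raise KeyError(f"unknown personal data: {data_id}")
        # Ownership from the inventory makes a per-data-subject check possible.
        if not self.is_compliant(entry.data_subject, entry.category, purpose):
            raise PermissionError(f"no consent from {entry.data_subject} for {purpose}")
        return entry.location

# Illustrative consent store: (data subject, category) -> consented purposes.
consents = {("alice", "email"): {"marketing"}}

def is_compliant(subject, category, purpose):
    return purpose in consents.get((subject, category), set())

inventory = PersonalDataInventory()
inventory.register(InventoryEntry("rec-1", "crm-db/contacts", "alice", "email"))
gateway = PersonalDataGateway(inventory, is_compliant)

location = gateway.request("rec-1", "marketing")
```

The gateway itself stores no personal data in this sketch; it only resolves who the data belongs to and where it lives, then delegates the decision to the compliance check.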
Depending on the intended use case, we distinguish between two conceptually different, yet implementation-wise similar architecture setups: ex post and ex ante.
Ex post compliance checking. As they process personal data, applications write the processing events to a processing log, which is then inspected for compliance.
Ex ante compliance checking. Applications submit their requests for processing of personal data, which are then inspected for compliance. The answers are then fed back to the requesting applications.
Figure 1. SPECIAL-K architecture setup for ex post compliance checking
Figure 2. SPECIAL-K architecture setup for ex ante compliance checking
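The difference between the two setups can be sketched as two thin wrappers around the same compliance check. The event fields and the shape of the consent store are assumptions for illustration; the actual SPECIAL-K policy check is not described in this document.

```python
def complies(event, consents):
    """Shared check: has the user consented to this data category and purpose?"""
    allowed = consents.get(event["user"], set())
    return (event["data"], event["purpose"]) in allowed

def check_ex_post(processing_log, consents):
    """Ex post: inspect events that already happened and report violations."""
    return [event for event in processing_log if not complies(event, consents)]

def check_ex_ante(request, consents):
    """Ex ante: answer a request before processing; the answer goes back to the application."""
    return {"request": request, "granted": complies(request, consents)}

# Illustrative consent store: user -> set of (data category, purpose) pairs.
consents = {"alice": {("email", "marketing")}}

log = [
    {"user": "alice", "data": "email", "purpose": "marketing"},
    {"user": "alice", "data": "location", "purpose": "profiling"},
]

violations = check_ex_post(log, consents)
answer = check_ex_ante({"user": "alice", "data": "location", "purpose": "profiling"}, consents)
```

Both modes reuse `complies`, which reflects why the two setups are conceptually different yet similar in implementation: only the input (a log of past events versus a pending request) and the handling of the result change.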