Following my last post on a few things to consider when going through the exercise of creating your GDPR data processing inventories, I realised that there is much more that can be said on this particular topic. As such in this post, I would like to provide more specific information, as to the best practices when structuring your data processing inventory.
A data processing activity may use multiple personal data elements and those data elements may come from different, files and documents. These same files and documents may be used for other data processing activities. Having established we come to the conclusion that if the data processing inventory is not properly designed, we may be in for a lot of repetitive input. As such it is best to approach this like a database architect.
Before we dive in, it is important to arm yourself with a very useful concept which will be covered in Section 1. If you’re already familiar with the concept of database normalisation, feel free to skip to section 2.
Section 1: Database Normalisation concept.
In databases there is a concept knows as normalisation. It’s scope is to reduce the amount of duplicated information, having smaller tables (with less columns), and at the same time maintaining the relationship between the different related records intact.
To expand on this concept for the non-techies, let’s imagine you have a customer and you want to record the purchases of that customer. In a non-normalised concept you would have 1 table with the following details:
As you can see a lot of the information is repeated multiple times, with no additional value. In a normalised design you would have the following tables instead.
Normalisation allows you to reference existing information rather than input it all over again. This technique is a heaven sent when it comes to structuring and designing your data processing inventory as it will avoid a lot of unnecessary work and repetition.
Section 2: Designing your inventory
Step 1: Identify the sections
The first step is to determine what information you will need to include in your data processing inventory and split them into sections.
Here are a few of the sections I identified when going through this exercise.
- The data processing activity itself.
- The documents/Files upon which the data processing activity is conducted.
- The assets where these documents/files/data is stored, collected from or output to.
- The retention period per document/file.
- The vendors and their contracts (for proof that data transfer is legalised) that provide such assets or with whom any data is shared.
Step 2: Identify relationships between sections
In other words, which sections are directly related to which sections and is this relationship ‘1 to 1’, ‘1 to many’ or ‘many to many’
- A data processing activity may use multiple documents.
- The same document may be used for multiple data processing activities.
As such the relationship between activities and documents is many to many.
Other relationships within the mentioned sections:
- A document/file may be stored on multiple assets.
- Each of the document, asset combinations may have a different retention and archival period determined.
- A vendor may provide one or more assets.
Once you have identified these relationships take the time to review and determine whether the sections can be segmented further. The better your segmentation, the more normalised your inventory will be and the less repetition and data collection and input you will need to do. The extra time spent on this step is a worthwhile investment.
Step 3: Identify the exact data elements per section
In this step you’ll need to think carefully about the information you need and which section it would fit. Here are a few data elements that I included in my sections. Note that this is by no means comprehensive but it should give you enough of an idea to get you started.
Data Processing Activity:
- Type of activity – Collection, Sharing, output, manipulation etc….
- Description of the activity
- Categories of data subjects impacted by this activity
- What is the purpose of this activity
- What type of personal data does this activity make use of
- Who is the data controller for this activity
- Who is the data processor for this activity.
- What is the lawful basis upon which this activity is taking place
- If the activity is sharing of data, to whom is it being transferred?
- What are the data transfer legal safeguards in place?
- Does this activity warrant a DPIA?
- What files/documents would someone need to carry out this activity? (Link to Documents/Files Section)
- Name of document
- From where is the information within this document/file collected?
- What personal data elements are present within this document?
- Where is this document/File stored? (link to the Assets Section)
- For how long is this document stored? (link to the Retention and Archival Section)
Retention and Archival
- Retention period
- Archival period
- Frequency of review
- Responsible to review
- Name of the asset
- Asset owner
- Security mechanisms
- Business continuity mechanisms
- Vendor (link to the Vendor section)
- Vendor name
- Is a contract in place
- Data processing agreement in place
- SLAs in place
Step 4: Standardise and simplify
Once you have identified all the information you would need to have within the inventory, the next step is to standardise and simplify as much as possible. Keep in mind that the purpose of the inventory is not only for record keeping obligations but also to permit you to find out which data elements are on which document and on which asset. This could be a really good tool in case a breach occurs, to narrow down the surface area and quickly zone in on what data elements may have been breached. As such the data activity inventory needs to be structured in a way that it can be easily reported on at a moment’s notice. When it comes to systems, even if you’re simply using excel, the more standardised the input the easier data can be filtered, segmented and grouped. As such it is important to go through each item and identify whether the input can be standardised.
For example ‘Categories of data subjects’ can be set-up as a select drop down field with the following options:
- General Population.
This way, anyone filling up the data processing record will select one of these options ensuring that the input is the same across the board, no typos, no differing terminology, easier to populate and once completed you can easily filter and report.
Step 5 – Set up your tool/repository
Now that you have all your data elements, options and relationships specified, all that is left is to choose the tool or repository where the data will be recorded and set it up. The above can be achieved through a spreadsheet program but I highly recommend investing in a software that is more geared towards this type of data input in this case.
Once you have your repository designed an set-up, all that’s left to do is go out there and make sure you record all the data processing activities involving any form of PII…. fun times…. or not…