Risk without the benefits - Data minimisation is not the solution



Data minimisation and the use of metadata are being actively investigated by countries all over the world in relation to real-time reporting. Their interest in such solutions stems from the need for more secure systems that are more resilient to data leaks caused by cyber attacks, human error or even snooping public officials. Although these systems may indeed require less data to perform the required tasks, they are neither completely secure nor able to increase VAT collection beyond what is possible with confidential systems.

In this article, we will first analyse the concepts of data minimisation and metadata, and the reasons why they may be considered more secure than conventional real-time reporting solutions. Next, we will explain how such systems are nonetheless still prone to data leaks, while their effectiveness regarding VAT collection and compliance does not significantly increase compared to a fully confidential system.

Data minimisation: what is it?

Data minimisation and metadata are increasingly being investigated in relation to real-time reporting solutions. Their purpose is to reduce the amount of data collected, both to make the system more secure in case of cyber attacks or data leaks and to comply with privacy and confidentiality laws and guidelines. The two concepts share some similarities and are sometimes confused.

According to the British Information Commissioner’s Office, the principle of data minimisation requires data to be “adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed”.[1] Under this principle, real-time reporting systems should collect only the minimum data necessary to perform checks and controls on the invoice information and to assess the correct VAT payment.

Maybe less data, what about metadata?

Connected to data minimisation is the concept of metadata. Metadata is data which describes and gives information about the original data, for example its date, size and source. Like minimised data, metadata is usually smaller than the original set of information. The main difference is that metadata requires an analysis of the original dataset in order to derive information which was not directly included in the original.[2]

Different types of businesses already use metadata to profile their customers and to suggest products they are likely to buy. For instance, Amazon registers the products you purchased or merely browsed. Every item which Amazon suggests to a customer who did not search for it is the result of a metadata analysis. Supermarkets and telecommunication companies may likewise constantly acquire and analyse data in order to offer better services.

The amount of metadata collected all around us is astonishing. In 2009, German politician Malte Spitz demanded that his telecom company hand over the data it had collected through his mobile phone: over a period of six months, the company had recorded his geographical location and what he had been doing with his phone more than 35,000 times.[3] The amount of data constantly collected by businesses and governments is becoming increasingly worrisome, and its use may not always remain commercially motivated and harmless in the future. Therefore, even with data minimisation strategies, the threat of leaking valuable data extends to metadata.

Data minimisation: is it more secure?

These methods of data collection are often considered more secure than storing a large amount of original data. To illustrate this, consider the following comparison between original data, minimised data and metadata for an invoice:

Original data       | Minimised data               | Metadata
Company name        | Company name                 | Timestamp
Unit prices         | VAT due to the tax authority | Issuance source (IP address, IT system)
Quantity            | Buyer’s information          | -
VAT rate            | -                            | -
Buyer’s information | -                            | -
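
The three columns above can be read as different views of one and the same invoice. The following is a minimal sketch of that idea, not a description of any real reporting system: the field names, the single-line invoice and the sample values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Invoice:
    # Hypothetical invoice with a single line item (illustrative fields only).
    company_name: str
    unit_price: Decimal
    quantity: int
    vat_rate: Decimal
    buyer: str
    timestamp: str   # metadata: when the invoice was issued
    source_ip: str   # metadata: where the invoice was issued from


def minimise(inv: Invoice) -> dict:
    """Keep only the 'Minimised data' column: seller, VAT due and buyer."""
    vat_due = (inv.unit_price * inv.quantity * inv.vat_rate).quantize(Decimal("0.01"))
    return {"company_name": inv.company_name, "vat_due": vat_due, "buyer": inv.buyer}


def extract_metadata(inv: Invoice) -> dict:
    """Keep only the 'Metadata' column: timestamp and issuance source."""
    return {"timestamp": inv.timestamp, "source_ip": inv.source_ip}


# Assumed sample values, not taken from any real invoice.
invoice = Invoice(
    company_name="Seller Ltd.",
    unit_price=Decimal("100.00"),
    quantity=3,
    vat_rate=Decimal("0.21"),
    buyer="Buyer B.V.",
    timestamp="2021-07-02T10:15:00Z",
    source_ip="192.0.2.10",
)

minimised = minimise(invoice)
meta = extract_metadata(invoice)
```

In this sketch the minimised view no longer contains unit prices, quantities or the VAT rate, yet it still names both trading parties and the VAT amount, and the metadata still reveals when and from where the invoice was issued.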

It is clear that, whether the system uses the original dataset, minimised data or metadata, a considerable amount of information ends up in it. In case of a cyber attack or a data leak, all data available in the system may be disclosed. Even with minimised data, malware may therefore leak a lot of information, which may harm the economic activity of a company or even disrupt the production chain of the whole economy, as is often the case with ransomware attacks.[4]

Does data minimisation for real-time reporting provide benefits?

The purpose of real-time reporting is to reduce the VAT gap and increase compliance. One way to achieve this is the automated analysis of whether the right VAT rate has been applied to each product. However, when data is minimised and VAT rates are not reported, this benefit is lost. This calls the whole idea of data minimisation into question, because the system would still carry risks without delivering any actual benefits.
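
To make this concrete, here is a minimal sketch of such an automated rate check. Everything in it is an assumption made for illustration: the product categories, the rate table and the record layout are hypothetical and do not describe any tax authority's actual rules or systems.

```python
from decimal import Decimal

# Hypothetical statutory rates per product category (illustrative values only).
EXPECTED_RATES = {
    "books": Decimal("0.09"),
    "electronics": Decimal("0.21"),
}


def check_vat_rate(record: dict) -> str:
    """Return 'ok', 'wrong_rate' or 'cannot_check' for a reported invoice record."""
    category = record.get("category")
    rate = record.get("vat_rate")
    expected = EXPECTED_RATES.get(category)
    # Without the product category and the applied rate there is nothing to compare.
    if rate is None or expected is None:
        return "cannot_check"
    return "ok" if rate == expected else "wrong_rate"


# Full record: the applied rate is reported, so a wrong rate can be detected.
full_record = {"category": "books", "vat_rate": Decimal("0.21"), "vat_due": Decimal("21.00")}

# Minimised record: only the VAT due is reported, so the check cannot run.
minimised_record = {"vat_due": Decimal("21.00")}

full_result = check_vat_rate(full_record)
minimised_result = check_vat_rate(minimised_record)
```

With the full record the check flags the wrong rate, while the minimised record can only be answered with "cannot_check", even though the VAT amount itself is still stored in the system.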


As we have seen, data minimisation and metadata do not ensure the security of a real-time reporting system, because they still rely on a large amount of data and potentially metadata, while reducing the benefits derived from real-time reporting. If the most important reason to share invoice data, i.e. checking the correct application of the VAT rates, no longer applies, then why should data be uploaded to the system in the first place? This question is all the more pressing because solutions exist that achieve the original goal of data minimisation.

Modern encryption methods allow for the implementation of a confidential real-time reporting system which does not collect any data while ensuring both the reduction of the VAT gap and the increase in business compliance. Such systems are already a reality and may be the best answer to the legitimate concern about collecting too much data.

If you want to learn more about exactly how summitto’s real-time reporting system works and how it benefits both the public and private sector, you can visit our website. For questions, shoot us a message at info@summitto.com.

[1] https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/principles/data-minimisation/

[2] https://www.lexico.com/definition/metadata

[3] https://www.eff.org/node/81907

[4] https://www.reuters.com/technology/200-businesses-hit-by-ransomware-following-incident-us-it-firm-huntress-labs-2021-07-02/