Information drives strategic decisions and business continuity, so it’s vital to control the data you own. Without proper information you can’t run your organization smoothly. The number of data breaches continues to rise, and companies need to protect their data more than ever. Organizations struggle to take countermeasures against data theft and data loss, and not all of them have implemented a data classification technique. In this article, you will learn more about data classification and how it helps to protect your valuable information.
Every organization that produces a product or service distinguishes between core processes and supporting processes. Core processes might require critical information, while supporting processes require less critical data. Organizations that lack a proper data classification strategy don’t know exactly which data is critical and which is not. Therefore, they don’t know how well the data contributes to their core business processes. They also don’t know how well they need to protect it, or which person or system should have access to it.
These organizations need to decide on data protection measures again and again, so they lose valuable time and money. Processes around data protection are not standardized, and discussions around them are not structured. In short, these organizations don’t know much about their data, even though it might be their most valuable asset.
Without all of this in place, you can’t fully benefit from your data. Migrating to the cloud is a big risk since you don’t know how to protect your data well, let alone make strategic decisions that push your organization further. Data classification can help to mitigate these issues.
Some of the benefits of having a proper data classification technique are:
- You can separate important data from less important data. For example, the privacy-related data of your top 10 customers is much more valuable to your organization than your internal photo book.
- Classified data gives you a framework to take protective measures for each piece of data in a structured way.
- The security awareness of your employees grows since they can quickly identify which information is highly critical and which is not. Everyone understands that information labeled “confidential” should be treated with much more care than public information that is widespread.
- Companies that operate in the EU need a clear picture of their data to comply with the GDPR and other regulations, and classification provides it. If your data is not classified, you are not “in control”.
- Data classification helps to identify a so-called “data owner”: the person or organizational unit that is ultimately responsible for the data.
Some of the challenges to classify data include the following:
- It takes time and effort to classify each piece of information, time that cannot be spent on creating business features.
- Classification adds an administrative burden for everyone who processes information in one way or another. For example, DevOps teams that deploy their applications in the cloud using Infrastructure as Code (IaC) need to add metadata to their resources, such as the CIA (confidentiality, integrity and availability) rating, their team ID and other fields.
- It is often difficult to find the sweet spot between protective measures and the cost of implementing them.
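To make the metadata burden a bit more concrete, here is a minimal sketch of a check that a deployment pipeline could run on a resource’s tags. The tag names (`cia_rating`, `team_id`), the rating format (three digits from 1 to 3, such as "3-2-1") and the function name are assumptions for illustration only; your own tagging policy will look different.

```python
import re

# Hypothetical tag names; adapt to your organization's tagging policy.
REQUIRED_TAGS = ("cia_rating", "team_id")

# Assumed rating format: three digits 1-3 for confidentiality, integrity
# and availability, e.g. "3-2-1".
CIA_PATTERN = re.compile(r"[1-3]-[1-3]-[1-3]")


def validate_tags(tags: dict[str, str]) -> list[str]:
    """Return a list of problems found in a resource's tags (empty if fine)."""
    problems = [f"missing tag: {name}" for name in REQUIRED_TAGS if not tags.get(name)]
    rating = tags.get("cia_rating")
    if rating and not CIA_PATTERN.fullmatch(rating):
        problems.append(f"invalid cia_rating: {rating!r}")
    return problems


if __name__ == "__main__":
    print(validate_tags({"cia_rating": "3-2-1", "team_id": "team-42"}))
    print(validate_tags({"cia_rating": "high"}))
```

A check like this does not remove the administrative work, but it can catch missing or malformed metadata before a resource is deployed.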
What about change
Another big challenge arises when the CIA rating of an application changes while DevSecOps teams and/or other security experts are already putting their measures in place. This “simple change” has far more impact than just changing a database field. If the CIA rating goes up, more restrictions apply to the data, so the DevSecOps team may need to reassess its cloud solution or implement extra security measures to strengthen data protection.
Overall, the benefits outweigh the drawbacks, so a lot of companies have already moved in the right direction.
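The sketch below illustrates why a rating change is more than a field update: it compares an old and a new rating and lists the measures that become mandatory. The 1 to 3 scale, the `CiaRating` class and every measure in the catalogue are assumptions made up for this example, not a recommended control set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CiaRating:
    """Assumed scale: 1 (low) to 3 (high) for each aspect."""
    confidentiality: int
    integrity: int
    availability: int


# Hypothetical measures per aspect and level; a real catalogue would come
# from your security policy.
MEASURES = {
    ("confidentiality", 2): "encrypt data at rest",
    ("confidentiality", 3): "restrict access to named individuals",
    ("integrity", 2): "enable audit logging",
    ("integrity", 3): "require signed change approvals",
    ("availability", 2): "take daily backups",
    ("availability", 3): "deploy to a second region",
}


def extra_measures(old: CiaRating, new: CiaRating) -> list[str]:
    """List measures that become mandatory when a rating goes up."""
    required = []
    for aspect in ("confidentiality", "integrity", "availability"):
        before, after = getattr(old, aspect), getattr(new, aspect)
        # Every level between the old and new rating adds its own measure.
        for level in range(before + 1, after + 1):
            measure = MEASURES.get((aspect, level))
            if measure:
                required.append(f"{aspect}: {measure}")
    return required


if __name__ == "__main__":
    for line in extra_measures(CiaRating(1, 2, 2), CiaRating(3, 2, 3)):
        print(line)
```

In this example, raising confidentiality from 1 to 3 and availability from 2 to 3 yields three additional measures, each of which means real work for the team.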
Different types of classifications
Today, several classification schemes exist. The focal point of every scheme is to answer the question of to what degree an organization should guarantee the confidentiality, integrity and availability (CIA) of its data.
The following combinations of data classification are the most common:
- Public, internal only, confidential
- Standard, sensitive, confidential, secret
- Restricted data, private data and public data
- Unclassified, restricted, confidential, secret and top secret
Companies need to decide which combination to use based on features, risk profiles and implementation costs. It’s a best practice to use only one data classification scheme and not to mix schemes for different types of data. Mixing them makes data classification more difficult, especially when it comes to the protective measures you want to implement later on.
So how should you identify which classification level applies to your data? First of all, the owner of the data has a big say in it. A Product Owner who truly owns the product they create with the team is responsible for it. Of course, a member of the CISO department can help to classify the data.
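One practical advantage of a single scheme is that its levels can be ordered and compared. The sketch below models the “public, internal only, confidential” combination as an ordered enumeration. The rule that combined data takes the strictest label of its sources is an assumption for illustration, not something a particular standard prescribes.

```python
from enum import IntEnum


class Classification(IntEnum):
    """One scheme for all data, ordered from least to most sensitive."""
    PUBLIC = 1
    INTERNAL_ONLY = 2
    CONFIDENTIAL = 3


def strictest(levels: list[Classification]) -> Classification:
    """Assumed rule: combined data takes the strictest label of its sources."""
    return max(levels)


if __name__ == "__main__":
    combined = strictest([Classification.PUBLIC, Classification.CONFIDENTIAL])
    print(combined.name)
```

With two different schemes in play, a comparison like this would first need a translation table between them, which is exactly the extra complexity you want to avoid.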
Step 1: assess the critical processes that depend heavily on your most valuable data. Business unit managers should be your most important stakeholders here. To find sensitive information, you can use tools like Enterprise Recon. This tool discovers sensitive information in a wide range of structured and unstructured datasets. It supports on-premises data storage solutions as well as cloud-based resources.
Step 2: identify data owners. The owner should be the business representative who benefits most from the data, and this person or department has to actually classify the data. In large organizations where applications and/or data storage solutions are spread across multiple teams and/or departments, this can be a big challenge.
Step 3: determine which rules and regulations apply. They set the requirements for processing and storing data, which also has an impact on the data classification. Think of the GDPR (General Data Protection Regulation) or a PIA (Privacy Impact Assessment). And let’s not forget ISO 27001 (paragraph A.8.2, information classification).
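To give a feel for what pattern-based discovery means, here is a deliberately simple sketch that looks for email addresses and card numbers in a piece of text. It is not how Enterprise Recon or any other commercial tool works internally; the patterns, the category names and the function names are assumptions for illustration, and real tools use far more robust detection.

```python
import re

# Deliberately simplified patterns; real discovery tools use far more
# robust detection and validation.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "card_number": re.compile(r"\b(?:\d[ -]?){15}\d\b"),
}


def luhn_valid(number: str) -> bool:
    """Check a card number candidate with the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_sensitive(text: str) -> dict[str, list[str]]:
    """Return the matches per category found in a piece of text."""
    hits = {name: pattern.findall(text) for name, pattern in PATTERNS.items()}
    # Drop 16-digit sequences that fail the checksum to reduce false positives.
    hits["card_number"] = [c for c in hits["card_number"] if luhn_valid(c)]
    return hits


if __name__ == "__main__":
    sample = "Contact jane.doe@example.com about card 4111 1111 1111 1111."
    print(find_sensitive(sample))
```

The checksum step shows why discovery is more than pattern matching: without validation, every 16-digit order number would be flagged as a card number.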
The data classification matrix
Step 4: fill in the classification matrix based on the chosen combination (for example public, internal only, confidential) and a number of key aspects that indicate how to treat the data. These key aspects are: access (levels), distribution, retention period, time to recover the data in case of data loss, reproducibility, and the so-called delete policies.
Based on this, the level of protection for every Configuration Item (CI) or business application (suite) is expressed in the so-called CIA rating. This is a piece of metadata that needs to be registered in the Configuration Management Database (CMDB). Once registered, it acts as the single source of truth for everyone in the organization who deals with this CI or business application (suite).
Step 5: map the data to processes. Data owners need to lay out their business processes and map them to the CIA ratings of the applications involved. Once finished, every process is classified along with the corresponding data and other information related to it.
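A classification matrix can be thought of as a lookup table from classification level to handling rules. The sketch below models it that way, using the key aspects listed in step 4 as fields. Every value in the matrix, as well as the names `HandlingRules` and `rules_for`, is a placeholder invented for this example; the real values have to come from your own policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandlingRules:
    """Key aspects from the matrix; all values below are placeholders."""
    access: str
    distribution: str
    retention_years: int
    recovery_time_hours: int
    reproducible: bool
    delete_policy: str


# Hypothetical matrix for the "public, internal only, confidential" scheme.
CLASSIFICATION_MATRIX = {
    "public": HandlingRules(
        "everyone", "unrestricted", 1, 72, True, "delete on request"),
    "internal only": HandlingRules(
        "employees", "inside the company", 3, 24, True, "delete after retention"),
    "confidential": HandlingRules(
        "named roles", "need to know", 7, 4, False, "secure wipe after retention"),
}


def rules_for(level: str) -> HandlingRules:
    """Look up the handling rules for a classification level."""
    try:
        return CLASSIFICATION_MATRIX[level.lower()]
    except KeyError:
        raise ValueError(f"unknown classification level: {level!r}") from None


if __name__ == "__main__":
    print(rules_for("Confidential"))
```

Keeping the matrix in one structure like this means every team reads the same rules instead of interpreting a label on its own.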
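The mapping in step 5 can be sketched as follows. The application names, the process names, the ratings and the rule that a process takes the highest value per aspect of the applications it uses are all assumptions for illustration; your organization may choose a different rule.

```python
# Hypothetical CIA ratings per application, as they might be registered
# in the CMDB: (confidentiality, integrity, availability), 1 = low, 3 = high.
APP_RATINGS = {
    "crm": (3, 2, 2),
    "billing": (3, 3, 2),
    "intranet": (1, 1, 1),
}

# Hypothetical mapping of business processes to the applications they use.
PROCESS_APPS = {
    "invoicing": ["crm", "billing"],
    "internal news": ["intranet"],
}


def process_rating(process: str) -> tuple[int, ...]:
    """Derive a process rating as the highest value per aspect of its apps."""
    ratings = [APP_RATINGS[app] for app in PROCESS_APPS[process]]
    # zip(*ratings) groups the values per aspect across all applications.
    return tuple(max(values) for values in zip(*ratings))


if __name__ == "__main__":
    for name in PROCESS_APPS:
        print(name, process_rating(name))
```

Taking the highest value per aspect is a cautious choice: a process is treated as being at least as sensitive as the most sensitive application it touches.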
Automation and tools
Where would we be without tools and automation? Nowhere! Since data classification requires a significant amount of time and effort from multiple departments in the organization, tools are of great help. They help to dig through massive amounts of structured and unstructured data at the beginning of the classification process, but they are also useful during the lifecycle of the actual data.
Azure Information Protection
Less than a year ago, Microsoft launched a new service called “Azure Information Protection”. In essence, this cloud-based solution applies labels to the documents and emails that it discovers and classifies. Two core functions are relevant here:
- Use the “Unified Labeling Client” to label, classify and protect emails and different file types, for example through File Explorer or PowerShell.
- Use the on-premises scanner to scan file repositories for sensitive content which must be classified, labeled and/or protected. This scanner also works in your Azure portal. Azure administrators are the target group for this feature.
Developers can also leverage the Microsoft Information Protection SDK to add labeling support to their third-party apps and other services. Examples include labeling and classifying files received from a backend service that aggregates and stores processed information, or sensitive audit logs that need special protection. From an automation perspective, the SDK is of great help here.
Data classification brings great benefits to organizations that use (critical) data in their (core) processes. Without it, they can’t make the best strategic decisions and they don’t know where to focus their protection efforts. Since the business representatives are the owners of their data, they have a big say in it. It’s important to choose a consistent data classification scheme and follow the “getting started” steps in this article. Cloud-native services such as Microsoft’s Azure Information Protection help to (automatically) classify and protect various pieces of information in all stages of the software development life cycle. All of this gives you enough information to get data classification off the ground in your organization.
If you have more questions about data protection, feel free to book a meeting with one of our solutions experts or send a mail to firstname.lastname@example.org.