Sergio Bertoni, leading analyst at SearchInform, unpacks what DCAP systems are, and how they can be used to protect data.

 

DCAP for dummies

On the one hand, DCAP/DAG systems aren’t magic unicorns. Many companies use them on a daily basis. The first systems of this kind date back to 2006, when Varonis Systems released DatAdvantage and became one of the first companies in the field of DCAP/DAG solutions.

The term DCAP was introduced by Gartner back in 2017. On the other hand, although there are now many different DCAP systems, there is still no consensus on the must-have list of capabilities for such tools. If you are brave enough, find an information security specialist and ask about the differences and similarities between DCAP, DAG and DSPM tools.

I mean, there is one main idea: the need to protect data. But there is no consensus on how to achieve it effectively. There are different approaches, so let’s take a closer look at the main ideas behind Data-Centric Audit and Protection solutions.

 

Yin and Yang of DCAP systems

In the beginning was the Word, and the Word was data. Data sources are the foundation of DCAP systems. They come in different forms, such as endpoints, cloud storage, mail servers, SANs, DBMSs, domain controllers, etc. The nature of a DCAP system is determined by how it reaches its sources: it can be network-based or agent-based.

The most common type of DCAP system is network-based. Such systems connect to SANs over LAN protocols, scan their data, read log files, and analyze file parameters and file access policies. They usually work with external data sources: network folders, cloud storage, external file logs, etc.

Agent-based DCAP systems are less common. Their main strength is their agents: software components that are installed on the selected data storage and embedded in its operating system. They collect detailed data about files, file operations, user activity, and user access rights. There is a limitation: agents work only inside the physical infrastructure, where they can interact locally with storage on endpoints and on-premises platforms.

The architecture of a DCAP system shapes its capabilities and limits. Let’s look at the particular qualities of each type step by step.

 

Storage audit

The foundation of DCAP systems is storage audit. The more data storage is monitored, the more effective the DCAP system is. In a perfect world, the DCAP system would have access to everything: local storage, LAN storage, users’ access rights, and the history of file operations.

This can be achieved with agents for different operating systems, support for dozens of communication protocols and file systems, and integration with mail servers, domain controllers, cloud services, etc. With all of these tools, it would be a perfect, well-rounded system. In practice, such solutions are quite rare. Usually, every system has its own pros and cons.
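To make the inventory side of a storage audit more concrete, here is a minimal Python sketch that walks a directory tree and records basic file attributes. It assumes the storage is reachable as a local path (for instance, a share that is already mounted); `FileRecord` and `audit_tree` are hypothetical names for illustration, not part of any DCAP product.

```python
import os
import stat
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class FileRecord:
    """Basic attributes an audit might record for one file (illustrative)."""
    path: str
    size: int
    modified: datetime
    mode: str        # permission string such as "-rw-r--r--"
    owner_id: int    # numeric owner id (always 0 on Windows)


def audit_tree(root: str) -> list[FileRecord]:
    """Walk `root` and collect one FileRecord per file that can be stat'ed."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                st = os.lstat(full_path)  # do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable: skip it
            records.append(
                FileRecord(
                    path=full_path,
                    size=st.st_size,
                    modified=datetime.fromtimestamp(st.st_mtime, tz=timezone.utc),
                    mode=stat.filemode(st.st_mode),
                    owner_id=st.st_uid,
                )
            )
    return records
```

A scan like this only captures the current state of the files. It says nothing about who changed them or when, which is exactly the kind of gap discussed for each type of system.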

In principle, network-based DCAP systems can monitor more data channels. They use protocols such as NFS, FTP, SMB, etc., each of which can connect to a different class of data storage: cloud, file server, etc. Thus, the DCAP system isn’t constrained by the operating system of the platform.

However, information about files won’t be complete. For example, the history of file operations won’t be available without additional tools. Moreover, network-based DCAP depends on the state of the network, because real-time file monitoring is carried out on the machine that runs the DCAP application rather than on the storage itself.

Agent-based DCAP systems are limited to a narrower pool of data sources than network-based ones. First, agents can’t be installed in cloud storage or on a mail server. Second, the agent is integrated with the operating system, so its functionality may differ between OSes.

Still, agent-based DCAP systems have an important advantage: they have more technical capabilities. Such systems can collect data and write event logs on their own, including the history of file operations, the user actions log, etc. Basically, they are independent entities that gather data by themselves. This allows them to collect comprehensive information about storage status during the file audit.

And last but not least, agent-based DCAP systems have direct access to the files.

 

Data classification

The main goal of DCAP systems is effective classification of the data they find. Decisions about access rights policies are based on data classification. Information security specialists review the analysis results and decide how strictly each type of data will be protected.

To categorize a file, the DCAP system has to read its content. DCAP systems use different types of parsers for different storage types and data formats. Moreover, OCR and speech-to-text services must be integrated into the DCAP system in order to analyze image and audio files.

In the next step, DCAP systems apply analytical tools. Data classification relies on a search engine with different content analysis technologies, such as analysis by attributes, dictionaries, regular expressions, semantic similarity, etc. Thus, the DCAP system is able to categorize files by their content, be it phone numbers, blueprints, financial documents, etc.
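As a rough illustration of two of these techniques, regular expressions and dictionaries, the Python sketch below assigns labels to already-extracted text. The patterns, keyword lists, label names, and the `classify` function are all hypothetical and deliberately simplistic. Real rule sets are far larger and tuned to the company’s data.

```python
import re

# Hypothetical regex rules: a label is assigned if any pattern matches.
REGEX_RULES = {
    "PII": [
        re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),    # SSN-like number
        re.compile(r"\+?\d[\d\s().-]{8,}\d"),    # loosely formatted phone number
    ],
}

# Hypothetical dictionary rules: a label is assigned if enough keywords occur.
DICTIONARY_RULES = {
    "commercial contract": {"agreement", "contractor", "hereinafter"},
    "financial document": {"invoice", "balance sheet", "vat"},
}


def classify(text: str, min_keyword_hits: int = 2) -> set[str]:
    """Return the set of labels whose rules match the given text."""
    labels = set()

    # Regular-expression analysis: one matching pattern is enough.
    for label, patterns in REGEX_RULES.items():
        if any(pattern.search(text) for pattern in patterns):
            labels.add(label)

    # Dictionary analysis: count how many keywords appear in the text.
    lowered = text.lower()
    for label, keywords in DICTIONARY_RULES.items():
        hits = sum(1 for keyword in keywords if keyword in lowered)
        if hits >= min_keyword_hits:
            labels.add(label)

    return labels
```

For example, `classify("Call me at +1 (555) 123-4567")` yields the "PII" label, while a text with no digits and no dictionary keywords yields an empty set. Requiring at least two keyword hits is one simple way to reduce false positives from a single common word.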

Next, data is classified according to its type of content. The DCAP system marks files with labels such as PII, commercial contracts, or trade secrets and recommends particular access levels (confidential, common, etc.).

Such labels can be stored as descriptions in the DCAP database. This is the approach of network-based DCAP systems. Alternatively, they can be written into file metadata or embedded directly into the file, as agent-based systems do. This distinction between the two groups of DCAP systems defines how they approach the goal of data protection.

 

Data protection

Access rights management is one of the most in-demand features of DCAP systems. The system must be able to block potentially dangerous user actions, such as transferring, editing, or deleting data.

Access rights management can be implemented in different ways. For example, operating systems and file systems do it on an attribute basis. In such cases, access to particular files and folders, and operations on them, can be limited for specified users or user groups. But these limits can be bypassed if users have administrator rights. A user can also change file attributes, copy file content to another file, or move the file to a different folder.

This is where the DCAP system comes into play. It manages access rights for all files with the same content. The file system driver reads the file label and requests instructions from the DCAP system on how to process the file. Such instructions follow this scheme:

  • Which group of files (by label/attribute)
  • Who is affected (user/user group)
  • Where the affected files are located (selected endpoint/storage)
  • Which group of applications is affected (particular software, be it Word, a messenger, an ERP, or a tailor-made application, or any application)
  • Whether the operation is allowed or disallowed

Instructions can thus be built according to this scheme. For example, the user group “DevOps” and user X cannot operate on files with the label “commercial contract” in any way. Another instruction might be: files with the label “PII” cannot be opened in mail services, messengers, or cloud storage.
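The scheme above maps naturally onto a small rule structure. The Python sketch below is only an illustration of that idea: the `Rule` and `Request` types, the wildcard matching, the application names, and the "any matching deny rule wins" policy are assumptions, not a description of how any particular product evaluates its instructions.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    label: str         # which group of files
    subject: str       # user or user group, "*" for everyone
    location: str      # endpoint/storage pattern, "*" for anywhere
    application: str   # application pattern, "*" for any application
    allow: bool        # whether the operation is allowed


@dataclass(frozen=True)
class Request:
    label: str
    user: str
    groups: frozenset[str]
    location: str
    application: str


def matches(rule: Rule, req: Request) -> bool:
    """True if every field of the rule applies to the request."""
    subject_ok = (
        rule.subject == "*"
        or rule.subject == req.user
        or rule.subject in req.groups
    )
    return (
        rule.label == req.label
        and subject_ok
        and fnmatchcase(req.location, rule.location)
        and fnmatchcase(req.application, rule.application)
    )


def is_allowed(rules: list[Rule], req: Request, default: bool = True) -> bool:
    """Assumed policy: if any matching rule denies, the operation is blocked."""
    matching = [rule for rule in rules if matches(rule, req)]
    if not matching:
        return default
    return all(rule.allow for rule in matching)


# The two example instructions from the text; application names are made up.
RULES = [
    Rule("commercial contract", "DevOps", "*", "*", allow=False),
    Rule("commercial contract", "user_x", "*", "*", allow=False),
    Rule("PII", "*", "*", "mail*", allow=False),
    Rule("PII", "*", "*", "messenger*", allow=False),
    Rule("PII", "*", "*", "cloud*", allow=False),
]
```

With these rules, a member of “DevOps” is blocked from a “commercial contract” file in any application, while a “PII” file can still be opened in a word processor but not in a mail client.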

It is worth noting that even if file content is copied to another file, the label stays in place. The label is content-dependent and will be applied to the recipient file. This is a capability of agent-based DCAP systems. Network-based systems don’t act this way.

 

Equilibrium point of DCAP systems

Both types of DCAP systems have their flaws. Network-based solutions monitor a wider range of data sources than agent-based ones. The latter excel at data protection, but only in local storage. Meanwhile, information security specialists want a system that has the strengths of both approaches: some kind of hybrid DCAP system.

That’s the direction in which we are developing our SearchInform FileAuditor. It can be deployed on endpoints and file servers. I would like to highlight that FileAuditor can be integrated with Windows- and Linux-based platforms. The agent is embedded at the driver level and marks files with labels in an alternate file stream. It’s our patented technology.

At the same time, FileAuditor supports all common network protocols, such as SSH, SMB, FTP, HTTPS, WebDAV, etc. Thus, it can monitor network folders, SANs, and cloud storage in addition to local storage infrastructure. Moreover, SearchInform FileAuditor can monitor Exchange mail servers, Huawei OceanStor SANs, and NetApp storage.

In theory, a hybrid DCAP system is a jack-of-all-trades, master of everything. In practice, there are several issues. Access rights management can’t be implemented over the network, while agents can’t be integrated with cloud storage. We are doing our best to solve these issues. At the moment, we are looking at potential integrations.

The first possible upgrade: the DCAP system can pull in information from other services.

For example, let’s take a look at file operations. Any SAN logs them regardless of the OS and file system type. In the case of Windows-based SANs, the DCAP system has to read the Windows logs. The same logic applies to specific file systems, like NetApp’s storage system. We decided to connect these logs to the DCAP system as additional sources of information. As a result, FileAuditor reads them and then matches the log data with information about file operations from the SAN. The same goes for Active Directory: FileAuditor gathers information about the history of account management and changes to users’ access rights and passwords.
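The matching step can be pictured as a join between two event streams. The Python sketch below is a simplified illustration under stated assumptions: the record types, field names, and the five-second tolerance window are hypothetical, and FileAuditor’s actual matching logic is not described in this article.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class LogEvent:
    """An entry read from an external storage log (hypothetical fields)."""
    time: datetime
    path: str
    user: str
    action: str


@dataclass(frozen=True)
class FileOperation:
    """A file change noticed by the DCAP scan, with no user attached yet."""
    time: datetime
    path: str
    action: str


def attribute_operations(
    operations: list[FileOperation],
    events: list[LogEvent],
    tolerance: timedelta = timedelta(seconds=5),
) -> list[tuple[FileOperation, Optional[str]]]:
    """Pair each operation with the user from the closest matching log event.

    Events match on path and action and must fall within `tolerance` of the
    operation time; operations with no match are paired with None.
    """
    # Index log events by (path, action) for quick lookup.
    by_key: dict[tuple[str, str], list[LogEvent]] = {}
    for event in events:
        by_key.setdefault((event.path, event.action), []).append(event)

    result = []
    for op in operations:
        candidates = [
            event
            for event in by_key.get((op.path, op.action), [])
            if abs(event.time - op.time) <= tolerance
        ]
        closest = min(candidates, key=lambda e: abs(e.time - op.time), default=None)
        result.append((op, closest.user if closest else None))
    return result
```

The point of the sketch is the enrichment itself: a scan alone knows that a file changed, while the external log supplies who changed it.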

The second important direction of development is integration with other security solutions.

Technical issues can be solved with creative solutions. For example, FileAuditor can’t enforce access rights management in cloud storage. However, a DLP system can. FileAuditor can be seamlessly integrated with Risk Monitor, SearchInform’s DLP solution. The former shares file label information with the DLP system, while the latter handles data in accordance with the information it receives. In this way, the DLP system can provide robust and comprehensive data protection for cloud storage. As a result, all information from various data sources will be classified, and data protection will be ensured.

 

To sum up

Every company has its own set of demands for DCAP system capabilities. One will prioritize data audits, while another will emphasize access rights management. In the real world, you can’t find a perfect DCAP system.

At SearchInform, we think that a hybrid approach is the way to go. In the coming years, more and more companies will adopt exactly this paradigm, because the hybrid approach unites the main strengths of network-based and agent-based solutions. It combines a long list of monitored data sources with depth and reliability of data protection.