Xerox is developing a new technology, intelligent redaction, which enables users to selectively encrypt different portions of a given document for different recipients.
The software can take into account the context of words or phrases in a document and determine when to redact the content and when to leave it visible. The new technology could be of great use to financial services companies, healthcare providers and other organisations that must deal with confidential information on a large scale.
Employees in these organisations often face situations in which they must either restrict the circulation of certain information or redact large amounts of it in order to comply with the lowest common denominator of access privileges. Intelligent redaction could allow for wider distribution of information with higher security at the same time.
Using the new software, the author of a document can mark sensitive portions of the text; the software then encrypts those portions or leaves them as plain text, depending on the reader.
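The per-reader behaviour can be illustrated with a minimal Python sketch. It is not Xerox's implementation: it masks restricted portions with a placeholder character instead of encrypting them, it assumes the marked portions do not overlap, and the names `Span` and `render_for` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A sensitive portion of the text, marked by the author (hypothetical type)."""
    start: int           # index of the first sensitive character
    end: int             # index one past the last sensitive character
    allowed: frozenset   # readers permitted to see this portion


def render_for(text, spans, reader, mask="#"):
    """Return the view of `text` that `reader` is allowed to see.

    Assumes spans do not overlap. A real system would encrypt the
    restricted portions; this sketch only masks them.
    """
    pieces = []
    pos = 0
    for span in sorted(spans, key=lambda s: s.start):
        pieces.append(text[pos:span.start])            # untouched text before the span
        segment = text[span.start:span.end]
        if reader in span.allowed:
            pieces.append(segment)                     # reader may see the portion
        else:
            pieces.append(mask * len(segment))         # hide it, keeping the length
        pos = span.end
    pieces.append(text[pos:])                          # remainder after the last span
    return "".join(pieces)


text = "Patient Jane Doe was prescribed Drug A."
start = text.index("Jane Doe")
spans = [Span(start, start + len("Jane Doe"), frozenset({"doctor"}))]

print(render_for(text, spans, "doctor"))
print(render_for(text, spans, "clerk"))
```

The same document yields a different view for each reader, which is what lets one file circulate more widely than its most sensitive passage would otherwise allow.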
The system also identifies potentially sensitive content, such as names and identification numbers, and then allows the author to select which of that text should actually be redacted.
The idea is to automate as much of the redaction process as possible, while still allowing users the final say in which recipients can read which parts of the document.
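The suggest-then-confirm workflow described above might look something like the following Python sketch. It is an illustration under stated assumptions, not the PARC software: the employee ID format (`EMP-` followed by six digits) is invented for the example, a single regular expression stands in for the system's detection logic, and the function names are hypothetical.

```python
import re

# Hypothetical employee ID format, invented for this example.
EMPLOYEE_ID = re.compile(r"\bEMP-\d{6}\b")


def suggest_candidates(text):
    """Automated step: list (start, end, value) for each possible ID found."""
    return [(m.start(), m.end(), m.group()) for m in EMPLOYEE_ID.finditer(text)]


def apply_confirmed(text, candidates, confirmed, mask="#"):
    """Human step: redact only the candidates the author has confirmed."""
    chars = list(text)
    for start, end, value in candidates:
        if value in confirmed:
            # Replace each character of the confirmed value with the mask.
            chars[start:end] = mask * (end - start)
    return "".join(chars)


text = "Contact EMP-123456 or EMP-654321."
candidates = suggest_candidates(text)

# The author reviews the suggestions and confirms only the first one.
print(apply_confirmed(text, candidates, {"EMP-123456"}))
```

Keeping detection and confirmation as separate steps mirrors the division of labour in the article: the software proposes, and the author retains the final say.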
"The system makes use of some natural language processing, but humans remain involved," says Rob Abraham, MD of Bytes Document Solutions, sole distributor of Xerox in 24 sub-Saharan countries. "You can write a rule and apply it and see the effects of it and then fine-tune it. You can apply rules at the word level or even at the sentence or paragraph levels."
The security and privacy research group at Xerox's Palo Alto Research Center (PARC) is leading the development of the new software.
In the process of developing the software, the PARC researchers spoke to potential users in a number of fields, including lawyers and medical records clerks. What they found was that people handle the redaction process in different ways, depending upon their roles and the kind of information in question.
"We found that people in the medical field often have to respond to subpoenas for medical records and there are some classes of information that they have to redact, like HIV status and any information on psychiatric care," says Jessica Staddon, area manager of the research group.
"These organisations were maintaining manual lists of medications and other information that should be redacted. It was a very manual, laborious process. If our software was running locally on someone's PC, it could automatically generate the list of these terms that need to be redacted and help improve the speed and accuracy of the process."
The usage model is considerably different in a setting such as a law office, Staddon says. In most cases, the redaction process for legal documents is a collaborative effort that may involve a junior lawyer who does the initial review, a subject-matter expert and perhaps a senior partner. In that case, the intelligent redaction software might run on a central server instead of users' machines.
The technology is still under development.