document sanitization

Document sanitization is the process of ensuring that only the intended information can be accessed from a document.

In addition to making sure the document text doesn’t openly divulge anything it shouldn’t, document sanitization includes removing document metadata that could pose a privacy or security risk. Document metadata can include the names of authors and modifiers, creation and modification dates, file size, tracked edits, revision histories and comment exchanges between authors and editors. Because that metadata may contain sensitive information, it's important to safeguard it from unauthorized access.
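To make the idea of hidden metadata concrete, here is a minimal Python sketch that lists a few identifying properties stored in a Word .docx file. It relies on the fact that a .docx file is a ZIP package whose core properties live in docProps/core.xml. The function name `read_core_metadata` and the `FIELDS` selection are illustrative assumptions, and the sketch covers only a small part of what a document can carry.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by docProps/core.xml in Office Open XML packages.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Hypothetical selection of core properties that tend to identify people or dates.
FIELDS = {
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
    "revision": "cp:revision",
}


def read_core_metadata(docx_file):
    """Return the selected core properties found in a .docx (path or file object)."""
    with zipfile.ZipFile(docx_file) as package:
        if "docProps/core.xml" not in package.namelist():
            return {}
        root = ET.fromstring(package.read("docProps/core.xml"))
    found = {}
    for label, tag in FIELDS.items():
        element = root.find(tag, NS)
        if element is not None and element.text:
            found[label] = element.text
    return found
```

Calling `read_core_metadata("report.docx")` on a typical file (the file name is a placeholder) would return a dictionary of whichever of these properties are present. An empty result does not mean the document is clean, since comments, tracked changes and custom properties are stored elsewhere in the package.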

A common way to remove metadata from a document is to convert the document to PDF format before releasing it; however, specific steps must be followed to ensure the converted document contains no unintended information. The National Security Agency (NSA) recommends the following six-step process for secure conversion and redaction of Word documents:

  1. Create a copy of the original document.
  2. Turn off “Track Changes” on the copy and remove all visible comments.
  3. Delete any sensitive information from the document that you wish to redact.
  4. Use the Microsoft Office Document Inspector to check for any unwanted metadata.
  5. Save the new document and convert it to a PDF file.
  6. Use the Sanitize Document tool in Acrobat Professional as a second check before releasing the redacted PDF.
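As a rough supplementary check on the working copy before conversion, the following Python sketch counts tracked insertions, tracked deletions and comments that remain inside a .docx file. It is not part of the NSA procedure and does not replace the Document Inspector or Acrobat's sanitization tool. The function name `find_residue` is hypothetical, and the sketch assumes the file uses the transitional WordprocessingML namespace shown in the code.

```python
import zipfile
import xml.etree.ElementTree as ET

# Transitional WordprocessingML namespace (assumed; strict-format files differ).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def find_residue(docx_file):
    """Count tracked insertions, tracked deletions and comments left in a .docx."""
    report = {"insertions": 0, "deletions": 0, "comments": 0}
    with zipfile.ZipFile(docx_file) as package:
        names = package.namelist()
        if "word/document.xml" in names:
            body = ET.fromstring(package.read("word/document.xml"))
            # Tracked changes are stored as w:ins and w:del elements.
            report["insertions"] = sum(1 for _ in body.iter(f"{{{W}}}ins"))
            report["deletions"] = sum(1 for _ in body.iter(f"{{{W}}}del"))
        if "word/comments.xml" in names:
            comments = ET.fromstring(package.read("word/comments.xml"))
            # Each reviewer comment is a w:comment element.
            report["comments"] = sum(1 for _ in comments.iter(f"{{{W}}}comment"))
    return report
```

A report with any nonzero count suggests that steps 2 through 4 should be repeated on the copy before it is converted to PDF. A report of all zeros covers only these three kinds of residue and says nothing about other hidden content.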

See also: metadata management, metadata security

This was last updated in August 2014
