Browse Definitions :
Definition

document sanitization

Document sanitization is the process of ensuring that only the intended information can be accessed from a document.

In addition to making sure the document text doesn’t openly divulge anything it shouldn’t, document sanitization includes removing document metadata that could pose a privacy or security risk. Document metadata can contain the names of authors and modifiers, the dates of creation and changes, file size, edit changes, revision histories and comment exchanges between authors and editors. Because that metadata may contain sensitive information, it's important safeguard it from unauthorized access. 

A common way to remove metadata from a document is to convert it to PDF format before releasing it; however, there are processes that must be followed to ensure the document contains no unintended information.  The National Security Agency (NSA) recommends the following six-step process for secure conversion and redaction of Word documents:

  1. Create a copy of the original document.
  2. Turn off “Track Changes” on the copy and remove all visible comments.
  3. Delete any sensitive information from the document that you wish to redact.
  4. Use the Microsoft Office Document Inspector to check for any unwanted metadata.
  5. Save the new document and convert it to a PDF file.
  6. Use the Sanitize Document tool in Acrobat Professional as a second check before releasing the redacted PDF.

See also: metadata management, metadata security

This was last updated in August 2014

Continue Reading About document sanitization

Networking
  • network management system

    A network management system, or NMS, is an application or set of applications that lets network engineers manage a network's ...

  • host (in computing)

    A host is a computer or other device that communicates with other hosts on a network.

  • Network as a Service (NaaS)

    Network as a service, or NaaS, is a business model for delivering enterprise WAN services virtually on a subscription basis.

Security
  • WebAuthn API

    The Web Authentication API (WebAuthn API) is a credential management application program interface (API) that lets web ...

  • Common Vulnerability Scoring System (CVSS)

    The Common Vulnerability Scoring System (CVSS) is a public framework for rating the severity of security vulnerabilities in ...

  • Dridex malware

    Dridex is a form of malware that targets victims' banking information, with the main goal of stealing online account credentials ...

CIO
  • audit program (audit plan)

    An audit program, also called an audit plan, is an action plan that documents what procedures an auditor will follow to validate ...

  • blockchain decentralization

    Decentralization is the distribution of functions, control and information instead of being centralized in a single entity.

  • outsourcing

    Outsourcing is a business practice in which a company hires a third party to perform tasks, handle operations or provide services...

HRSoftware
  • team collaboration

    Team collaboration is a communication and project management approach that emphasizes teamwork, innovative thinking and equal ...

  • employee self-service (ESS)

    Employee self-service (ESS) is a widely used human resources technology that enables employees to perform many job-related ...

  • learning experience platform (LXP)

    A learning experience platform (LXP) is an AI-driven peer learning experience platform delivered using software as a service (...

Customer Experience
  • market segmentation

    Market segmentation is a marketing strategy that uses well-defined criteria to divide a brand's total addressable market share ...

  • sales pipeline

    A sales pipeline is a visual representation of sales prospects and where they are in the purchasing process.

  • market basket analysis

    Market basket analysis is a data mining technique used by retailers to increase sales by better understanding customer purchasing...

Close