Browse Definitions :
Definition

document sanitization

Document sanitization is the process of ensuring that only the intended information can be accessed from a document.

In addition to making sure the document text doesn’t openly divulge anything it shouldn’t, document sanitization includes removing document metadata that could pose a privacy or security risk. Document metadata can contain the names of authors and modifiers, the dates of creation and changes, file size, edit changes, revision histories and comment exchanges between authors and editors. Because that metadata may contain sensitive information, it's important safeguard it from unauthorized access. 

A common way to remove metadata from a document is to convert it to PDF format before releasing it; however, there are processes that must be followed to ensure the document contains no unintended information.  The National Security Agency (NSA) recommends the following six-step process for secure conversion and redaction of Word documents:

  1. Create a copy of the original document.
  2. Turn off “Track Changes” on the copy and remove all visible comments.
  3. Delete any sensitive information from the document that you wish to redact.
  4. Use the Microsoft Office Document Inspector to check for any unwanted metadata.
  5. Save the new document and convert it to a PDF file.
  6. Use the Sanitize Document tool in Acrobat Professional as a second check before releasing the redacted PDF.

See also: metadata management, metadata security

This was last updated in August 2014

Continue Reading About document sanitization

Networking
  • network traffic

    Network traffic is the amount of data that moves across a network during any given time.

  • dynamic and static

    In general, dynamic means 'energetic, capable of action and/or change, or forceful,' while static means 'stationary or fixed.'

  • MAC address (media access control address)

    A MAC address (media access control address) is a 12-digit hexadecimal number assigned to each device connected to the network.

Security
  • Trojan horse

    In computing, a Trojan horse is a program downloaded and installed on a computer that appears harmless, but is, in fact, ...

  • quantum key distribution (QKD)

    Quantum key distribution (QKD) is a secure communication method for exchanging encryption keys only known between shared parties.

  • Common Body of Knowledge (CBK)

    In security, the Common Body of Knowledge (CBK) is a comprehensive framework of all the relevant subjects a security professional...

CIO
  • benchmark

    A benchmark is a standard or point of reference people can use to measure something else.

  • spatial computing

    Spatial computing broadly characterizes the processes and tools used to capture, process and interact with 3D data.

  • organizational goals

    Organizational goals are strategic objectives that a company's management establishes to outline expected outcomes and guide ...

HRSoftware
  • talent acquisition

    Talent acquisition is the strategic process employers use to analyze their long-term talent needs in the context of business ...

  • employee retention

    Employee retention is the organizational goal of keeping productive and talented workers and reducing turnover by fostering a ...

  • hybrid work model

    A hybrid work model is a workforce structure that includes employees who work remotely and those who work on site, in a company's...

Customer Experience
  • database marketing

    Database marketing is a systematic approach to the gathering, consolidation and processing of consumer data.

  • cost per engagement (CPE)

    Cost per engagement (CPE) is an advertising pricing model in which digital marketing teams and advertisers only pay for ads when ...

  • B2C (Business2Consumer or Business-to-Consumer)

    B2C -- short for business-to-consumer -- is a retail model where products move directly from a business to the end user who has ...

Close