ar130405 - Fotolia


Updating the data discovery process in the age of CCPA

Privacy regulations are changing the enterprise data discovery process. Now, automation is key for fulfilling data discovery mandates, including those for CCPA and GDPR.

The California Consumer Privacy Act, which came into effect on Jan. 1, 2020, requires businesses categorize personal information so it can be easily found and communicated in response to customer data use requests.

While it is not as strict as the EU's GDPR, both are based on the same principles of data visibility, transparency and accountability. And, though CCPA only applies to entities that do business in the state of California or collect data on Californians, it's unlikely there are many businesses not affected by this new legislation. Moreover, additional privacy legislation is likely to be passed in the coming months and years; it would be prudent for organizations to revisit their data discovery processes now to stay ahead of the curve.

CCPA: Redefining personal information and its collection

The CCPA's definition of personal information goes well beyond data that can be obviously associated with an identity, such as name, date of birth or Social Security number -- data collectively known as personally identifiable information. CCPA encompasses data that identifies, relates to, describes, is capable of being associated with or could reasonably be directly or indirectly linked with a consumer or household. This could include indirect information, such as product preference or geolocation data. CCPA legislation also requires organizations justify why they collect personal information and how they use it.

Creating such a broad range of personal information elements that are subject to CCPA privacy oversight has a major effect on how organizations collect and process consumers' data. Data discovery, particularly of indirect information, becomes a key compliance task. The need to protect data subject rights at scale requires a different approach to discovering and correlating data from the traditional application-only, IT-oriented start point. It will require most organizations to rethink their data discovery and inventorying processes in order to get personal data sprawl under control. This will help companies have an accurate picture of whose data they have, where it resides and how it is being used. Although CCPA doesn't explicitly require a personal information process inventory, GDPR does, and it's a sustainable and practical approach to achieving accountability and maintenance.

GDPR personal data examples
Personal information, as defined by the GDPR

How to update data discovery processes for CCPA

Application-only data inventorying typically omits the business rationale for data collection. Without this context, it's impossible to justify the data's use to the customer it is collected on. It also ignores personal information that is consumed outside of core applications. For example, hand-signed agreements, electronic documents emailed by customers as attachments and hard copies sent through the postal service all need to be inventoried as CCPA requirements extend to unstructured data and paper-based documents.

It is important to note that data discovery is not a one-off task. Because personal information is fluid, inventories can quickly become outdated unless automated scanning tools are deployed to sustain data discovery capture by recording regular snapshots of all applications and repositories where personal information resides.

CCPA data rights
CCPA consumer data rights

Automation is the only way large organizations can remain CCPA-compliant given the massive volume of data they collect across a wide range of different applications and then retain in a combination of structured and unstructured data stores in data centers and the cloud. Data privacy platforms and services, such as those from, Transcend Inc. and OneTrust LLC, can ease the burden of CCPA compliance through AI-driven data discovery, data subject request automation and document accountability, as well as provide visibility into data processing activities.

Achieving maintainable compliance also requires organizations to eliminate information and business silos. If these silos remain, creating and maintaining comprehensive process maps that ensure everyone understands the business context of the personal information of their customers will be impossible. This will also help with data subject requests -- for example, when consumers request to know what info is collected about them, if it is sold or disclosed and to whom, and more. Being able to efficiently handle data subject requests that cross multiple business units and systems will entail creating a single customer identifier that links to all inventoried personal information across an organization, and that is only going to be possible if all stakeholders are involved.

The most effective CCPA efforts will be those that work holistically with other privacy-related regulations, such as GDPR, and keep data discovery processes updated so they are able to handle data volumes and complexity as they increase over time. Without these efforts, organizations will be at risk of fines and potential class-action civil litigation if they can no longer prove transparency about how and where client information is stored.

Next Steps

Benefits of cloud data discovery tools and services multiply

4 enterprise database security best practices

Dig Deeper on Compliance

Enterprise Desktop
Cloud Computing