Browse Definitions :

data abstraction

What is data abstraction?

Data abstraction is the reduction of a particular body of data to a simplified representation of the whole. Abstraction, in general, is the process of removing characteristics from something to reduce it to a set of essential elements. To this end, data abstraction creates a simplified representation of the underlying data, while hiding its complexities and associated operations. In computing, data abstraction is commonly used in object-oriented programming (OOP) and when working with a database management system (DBMS).

What is data abstraction in OOP?

Modern programming languages that incorporate OOP methodologies commonly use data abstraction to hide the low-level details of the programming constructs that define the underlying logic, which in turn simplifies and streamlines the development process. Object-oriented programming organizes software design around data objects, rather than functions or logic. Data abstraction is a key characteristic of OOP that's implemented using classes and objects.

A class is a template definition that bundles related attributes and methods into a named package. It is a special type of object that serves as a blueprint for creating other objects. In addition, a class can include subclasses that inherit some or all the class's attributes and methods, resulting in a type of class hierarchy. Subclasses can also define their own attributes and methods.

An object is a named instance of a class that's created with specifically defined data. The object's attributes are unique from other objects based on the same class. The object serves as a simple representation of the class, abstracting the implementation details of the class itself. In this way, developers can focus primarily on the objects and how to manipulate them, rather than on how to implement the underlying logic.

Suppose, for example, that a development team is building an application for an organization's human resources department. The application contains an Employee class that includes attributes such as first name, last name, contact information, job skills and other relevant information. The Employee class also includes several methods, including one for updating an employee's list of job skills. From this class, the application can instantiate objects that have access to these properties and methods without having to recreate them for each employee record, helping to simplify the overall application development process.

Because the class provides a template for building the objects, developers can generate those objects as often as necessary without being concerned about how the class itself implements the underlying logic and without having to reconstruct the same logic each time an Employee object is needed. For example, the application can invoke the method for updating job skills without being concerned about how the update is carried out. All properties and methods are already defined, whether or not they're needed when instantiating an individual object. In this way, the underlying implementation complexities are abstracted away and hidden during the bulk of the development effort.

object-orientated programming
This diagram shows an example of the structure and naming in OOP.

What is data abstraction in a DBMS?

Data abstraction is also used with databases to hide the underlying complexities of how the data is stored and managed. The view of the data that users have is much different from the way in which the data is persisted to storage. Data abstraction, as it applies to databases, is typically divided into three layers:

  • View layer. At this level, the user sees the data as it's presented by an application interface. The user might be able to interact with the data or might only be able to view it. In either case, the user typically has access to only certain types of data, has a very limited view of the data in its entirety and has no concept of how or where the data is stored.
  • Logical layer. This layer provides a conceptual understanding of the data. It describes the type of data and how that data is related. You can think of this layer in terms of an entity relationship diagram (ERD) that lays out the data objects like a blueprint and shows how they're related.
  • Physical layer. This layer focuses on the data's physical storage. It is concerned with where and how the data is stored, how the files are managed, the type of storage and anything else related to physically maintaining the data.
Data abstraction as it applies to databases
Diagram illustrating how data abstraction typically applies to databases

There are no set rules that dictate how the data abstraction layers should be implemented. It will depend on the type of database system, how the vendor has designed the platform, how administrators have implemented that platform, how the data has been modeled and the applications used for accessing the data.

Regardless of the approach, data abstraction remains an important component of any DBMS because it makes it possible for users to easily work the data while allowing changes to occur on the back end without breaking the front-end applications. For example, a database administrator should be able to change where a database file is stored without affecting any applications that retrieve data from that file. A database developer should even be able to change a table's layout as long as an abstraction layer sits on top and can absorb the change. Data abstraction allows for the greatest flexibility in supporting changing business requirements and implementing new applications.

Learn about the difference between DBMS and RDBMS and functional versus object-oriented programming.

This was last updated in June 2022

Continue Reading About data abstraction

  • network traffic

    Network traffic is the amount of data that moves across a network during any given time.

  • dynamic and static

    In general, dynamic means 'energetic, capable of action and/or change, or forceful,' while static means 'stationary or fixed.'

  • MAC address (media access control address)

    A MAC address (media access control address) is a 12-digit hexadecimal number assigned to each device connected to the network.

  • Evil Corp

    Evil Corp is an international cybercrime network that uses malicious software to steal money from victims' bank accounts and to ...

  • Trojan horse

    In computing, a Trojan horse is a program downloaded and installed on a computer that appears harmless, but is, in fact, ...

  • quantum key distribution (QKD)

    Quantum key distribution (QKD) is a secure communication method for exchanging encryption keys only known between shared parties.

  • green IT (green information technology)

    Green IT (green information technology) is the practice of creating and using environmentally sustainable computing.

  • benchmark

    A benchmark is a standard or point of reference people can use to measure something else.

  • spatial computing

    Spatial computing broadly characterizes the processes and tools used to capture, process and interact with 3D data.

  • talent acquisition

    Talent acquisition is the strategic process employers use to analyze their long-term talent needs in the context of business ...

  • employee retention

    Employee retention is the organizational goal of keeping productive and talented workers and reducing turnover by fostering a ...

  • hybrid work model

    A hybrid work model is a workforce structure that includes employees who work remotely and those who work on site, in a company's...

Customer Experience
  • BOPIS (buy online, pick up in-store)

    BOPIS (buy online, pick up in-store) is a business model that allows consumers to shop and place orders online and then pick up ...

  • real-time analytics

    Real-time analytics is the use of data and related resources for analysis as soon as it enters the system.

  • database marketing

    Database marketing is a systematic approach to the gathering, consolidation and processing of consumer data.