What is the definition of data abstraction, and what are the primary data abstraction layers we should be concerned about?
Data abstraction is useful because it lets people understand and build complex systems, such as databases, one level of detail at a time.
A good place to start understanding the definition of data abstraction is to think about the way the word 'abstract' is used when we talk about a long document. The abstract is the shortened, simplified form. We often read it to get an overview before reading the entire paper. (Actually we often read it INSTEAD of reading the paper, but that's another issue.)
Now think about designing a brand new, complex database system. One way to design the database is to sit down and start writing code. You start at one end and keep writing until the database is finished. In practice there are very few (probably zero) people who can do this; the problem is too complex.
So instead we start with a simple written description of the database that the users have asked for. That description is an abstraction of the database. Then we add a little detail to it, making it into another abstraction, but a somewhat more detailed one. We keep on adding detail until there is no more detail to add; the abstraction has turned into the thing itself.
The three formal abstraction layers we usually use are:
- User model: How the user describes the database
- Logical model: More formal, more detail – often rendered as an entity-relationship (ER) model
- Physical model: More geeky detail added – indexing, data types, etc.
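To make the three layers concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. The library-lending scenario and every name in it (USER_MODEL, LOGICAL_MODEL, PHYSICAL_MODEL, the member, book and loan tables, build_database, table_names) are hypothetical and chosen purely for illustration; a real project would usually capture the logical model in an ER diagram rather than a dictionary.

```python
import sqlite3

# User model: how the user describes the database (hypothetical example).
USER_MODEL = "Members borrow books; we need to know who has which book."

# Logical model: entities, attributes and relationships, with no storage
# details yet. A dictionary stands in for an ER diagram here.
LOGICAL_MODEL = {
    "Member": ["member_id", "name"],
    "Book": ["book_id", "title"],
    "Loan": ["member_id", "book_id", "due_date"],  # relates Member to Book
}

# Physical model: the same entities with data types, keys and an index added.
PHYSICAL_MODEL = """
CREATE TABLE member (member_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE book (book_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE loan (
    member_id INTEGER NOT NULL REFERENCES member(member_id),
    book_id INTEGER NOT NULL REFERENCES book(book_id),
    due_date TEXT NOT NULL
);
CREATE INDEX idx_loan_member ON loan(member_id);
"""


def build_database():
    """Create an in-memory SQLite database from the physical model."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(PHYSICAL_MODEL)
    return conn


def table_names(conn):
    """Return the names of the tables that were actually created, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_database()
    print(USER_MODEL)
    print(sorted(LOGICAL_MODEL))
    print(table_names(conn))
```

Each layer describes the same database; only the amount of detail changes. The one-sentence user model becomes three entities in the logical model, and those become three tables with types and an index in the physical model.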
Data abstraction is simply a way of turning a complex problem into a manageable one.