Chapter 4 - Data Modeling Using ER Model


Fundamentals of Database System

Chapter Four
Data Modeling Using the Entity-Relationship (ER) Model
4.1 Database Design
Database design is the process of producing the specifications for the data to be stored in a database.
It is one of the middle phases of information systems development when the system uses a database
approach. During design, we describe how the data should be perceived at different levels and, finally,
how it will be stored in a computer system.
The ability to design databases and associated applications is critical to the success of the modern
enterprise. Database design requires understanding both the operational and business requirements of an
organization as well as the ability to model and realize those requirements using a database.
Database and information systems are developed using a development lifecycle, which consists of a
series of steps. Because database design is one component of most information system development
tasks, there are several steps to follow in designing a database system.
Developing an information system with a database application consists of several tasks, which include:
 Planning of Information systems Design
 Requirements Analysis,
 Design (Conceptual, Logical and Physical Design)
 Tuning
 Implementation
 Operation and Support
Requirements gathering and specification provide you with a high-level understanding of the
organization, its data, and the processes that you must model in the database. Database design involves
constructing a suitable model of this information. Since the design process is complicated, especially for
large databases, database design is mainly focused on these three phases:
1. Conceptual Design
2. Logical Design, and
3. Physical Design
In general, one has to go back and forth between these tasks to refine a database design, and decisions in
one task can influence the choices in another task.

AASTU Compiled by Chere L. (M.Tech) Page 1


The Three levels of Database Design

Conceptual Database Design


Conceptual design is the process of constructing a model of the information used in an enterprise,
independent of any physical considerations.
 It is used as input (the source of information) for the logical design phase.
 Mostly uses an Entity Relationship Model to describe the data at this level.
 It is a phase which is independent of all physical considerations (DBMS, OS, . . . ).
Conceptual design revolves around discovering and analyzing organizational and user data
requirements. The important activities are to identify
 Entities
 Attributes
 Relationships
 Constraints
Based on these components, the ER model is developed and drawn as an ER diagram.
After completing the conceptual design, the schema is refined by verifying the entities, attributes, and
relationships.
In developing a good design, one should answer questions such as:
 What are the relevant entities for the organization?
 What are the important features of each entity?
 What are the important relationships?
 What are the important queries from the user?
 What are the constraints (business rules) that must hold for entities and relationships?
 What are the other requirements of the organization and the users?

Reasons for conceptual modeling
 Helps users and system developers to identify data requirements (abstract model)
 Helps in understanding how existing systems can be modified/maintained
 Allows for easy communication between end-users and developers.
 Independent of any DBMS or OS.
 Has a clear method to convert from high-level model to relational model.
 It is a permanent description of the database requirements.

Logical Database Design


Logical design is the process of constructing a model of the information used in an enterprise based on a
specific data model (e.g. relational, hierarchical or network or object), but independent of a particular
DBMS and other physical considerations.
 Collection of Rules to be maintained
 Discover new entities in the process
 Revise attributes based on the rules and the discovered Entities

Physical Database Design


Physical design is the process of producing a description of the implementation of the database on
secondary storage. It defines the specific storage structures and access methods used by the database.
 Describes the storage structures and access methods used to achieve efficient access to data.
 Tailored to a specific DBMS; its characteristics are a function of the DBMS and the operating
system.
 Includes an estimate of storage space.

4.2 The Entity Relationship (E-R) Model


The ER model is a graphical representation of the entities in a database structure and the relationships
among them. It quickly became popular because it complemented relational data model concepts.
Entity-Relationship modeling is used to represent the conceptual view of a database. The main
components of ER modeling are:
(a) Entities
 An entity is defined as anything about which data are to be collected and stored.
 Represented in the ERD by a rectangle, also known as an entity box.
 The name of the entity, a noun, is written in the center of the rectangle.
 The entity name is generally written in capital letters and is written in the singular form:
PAINTER rather than PAINTERS, and EMPLOYEE rather than EMPLOYEES.

 Usually, when applying the ERD to the relational model, an entity is mapped to a relational
table; that is, an entity corresponds to the entire table, not to a single row.
 Each row in the relational table is known as an entity instance or entity occurrence in the ER
model.
 Each entity is described by a set of attributes that describes particular characteristics of the entity.
 For example, the entity EMPLOYEE will have attributes such as a Social Security number, a last
name, and a first name.
Examples of Entities
Persons: agency, contractor, customer, department, division, employee, instructor, student,
supplier.
Places: sales region, building, room, branch office, campus.
Objects: book, machine, part, product, raw material, software license, software package, tool,
vehicle model, vehicle.
Events: application, award, cancellation, class, flight, invoice, order, registration, renewal,
requisition, reservation, sale, trip.

(b) Attributes
 Are properties used to describe each Entity or real world object.
 Are used to store pieces of information about entities.
 Attributes will give rise to recorded items of data in the database
 For example, the STUDENT entity includes, among many others, the attributes STU_LNAME,
STU_FNAME, and STU_INITIAL.
 In the original Chen notation, attributes are represented by ovals and are connected to the entity
rectangle with a line.

(c) Relationships
 Relationships describe associations among data (exist between entities).
 Most relationships describe associations between two entities.
 Relationship (relationship type) is a meaningful association among entity types.
 Generally, a relationship is represented as a connection between (or among) entities.
 In the standard ER model, a diamond shape is used to connect the related entities.

 The relationship name is an active or passive verb; for example, a STUDENT takes a CLASS,
a PROFESSOR teaches a CLASS, a DEPARTMENT employs a PROFESSOR, a
DIVISION is managed by an EMPLOYEE.

 There are several types of relationships, classified by degree, cardinality, and participation.

 The entities that participate in a relationship are also known as participants, and each
relationship is identified by a name that describes the relationship.

When the basic data model components were introduced, three types of relationships among data were
illustrated:
 One-to-Many (1:M)
 Many-to-Many (M:N), and
 One-to-One (1:1)
The ER model uses the term connectivity to label the relationship types.
 The name of the relationship is usually an active or passive verb.
 For example, a PAINTER paints many PAINTINGs; an EMPLOYEE learns many SKILLs;
an EMPLOYEE manages a STORE.

(d) Constraints: represent the restrictions (business rules) that the data must satisfy.

Before working on the conceptual design of the database, one has to answer the following basic
questions:
• What are the entities and relationships in the enterprise?
• What information about these entities and relationships should we store in the database?
• What are the integrity constraints that hold, that is, the constraints on the data with respect to
update, retrieval, and storage?
Once these questions are answered, represent the information pictorially in ER diagrams, then map the
ER diagrams into a relational schema.

4.3 Developing an E-R Diagram


Designing a conceptual model for the database is not a linear process but an iterative activity in which
the design is refined again and again.
To identify the entities, attributes, relationships, and constraints on the data, different methods are used
during the analysis phase. These include information gathered by:
 Interviewing end users individually and in a group
 Questionnaire survey
 Direct observation
 Examining different documents

The basic E-R model is graphically depicted and presented for review. The process is repeated until the
end users and designers agree that the E-R diagram is a fair representation of the organization’s
activities and functions.
Checking for redundant relationships in the ER diagram: relationships between entities indicate access
from one entity to another. It is therefore possible to access one entity occurrence from another entity
occurrence even if other entities and relationships separate them. This is often referred to as
navigation of the ER diagram.
The last phase in ER modeling is validating the ER model against the requirements of the user.

Graphical Representations in ER Diagramming


• An entity is represented by a RECTANGLE containing the name of the entity.
• Connected entities are called relationship participants.
• Attributes are represented by OVALS and are connected to the entity by a line.
• A derived attribute is indicated by an oval drawn with a DOTTED LINE.
• Primary keys are underlined.
• Partial keys are underlined with a dotted line.

• Relationships are represented by DIAMOND shaped symbols
 Weak Relationship is a relationship between Weak and Strong Entities
 Strong Relationship is a relationship between two strong Entities

Example 1: Build an ER Diagram for the following information:
A student record management system will have the following two basic data object categories with their
own features or properties: Student will have an Id, Name, Dept, Age, and GPA, and Course will have an
Id, Name, and Credit Hours. Whenever a student enrolls in a course in a specific Academic Year and
Semester, the student will have a grade for the course.
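
One way to check the design for Example 1 before drawing it is to write the entities and the relationship down as plain data structures. The sketch below is illustrative only: the class and field names (`Student`, `Course`, `Enrollment`, and so on) are assumptions, and the sample values are made up. The point it shows is that Academic Year, Semester, and Grade belong to the relationship rather than to either entity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    student_id: str      # key attribute
    name: str
    dept: str
    age: int
    gpa: float


@dataclass(frozen=True)
class Course:
    course_id: str       # key attribute
    name: str
    credit_hours: int


@dataclass(frozen=True)
class Enrollment:
    # Relationship between Student and Course. The last three fields
    # describe the enrollment itself, not the student or the course.
    student_id: str
    course_id: str
    academic_year: str
    semester: int
    grade: str


# Hypothetical sample data
student = Student("S001", "Abebe", "Software Engineering", 20, 3.4)
course = Course("C101", "Fundamentals of Database System", 3)
enrollment = Enrollment(student.student_id, course.course_id, "2023/24", 1, "A")
```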

Example 2: Build an ER Diagram for the following information:


A Personnel record management system will have the following two basic data object categories with
their own features or properties: Employee will have an Id, Name, DoB, Age, and Tel, and Department
will have an Id, Name, and Location. Whenever an Employee is assigned to a Department, the duration of
his or her stay in the respective department should be registered.

Example 3: Build an ER Diagram for the following information:


A company database needs to store information about employees (identified by ssn, with salary and
phone as attributes); departments (identified by dno, with dname and budget as attributes); and children
of employees (with name and age as attributes). Employees work in departments; each department is
managed by an employee; a child must be identified uniquely by name when the parent (who is an
employee; assume that only one parent works for the company) is known. We are not interested in
information about a child once the parent leaves the company.
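
In Example 3, a child is a weak entity: it is identified by its name only together with the parent's ssn, and it should disappear when the parent leaves. A minimal sketch of how that could look once mapped to tables is shown below, using Python's built-in `sqlite3` module. The table and column names are assumptions, the department part of the example is left out, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE employee (
    ssn    TEXT PRIMARY KEY,
    salary REAL,
    phone  TEXT
);
-- Weak entity: the key is the parent's ssn plus the child's name.
CREATE TABLE child (
    parent_ssn TEXT NOT NULL REFERENCES employee(ssn) ON DELETE CASCADE,
    name       TEXT NOT NULL,
    age        INTEGER,
    PRIMARY KEY (parent_ssn, name)
);
""")

conn.execute("INSERT INTO employee VALUES ('111', 5000, '0911')")
conn.execute("INSERT INTO child VALUES ('111', 'Sara', 7)")

# When the parent leaves the company, the child rows go with it.
conn.execute("DELETE FROM employee WHERE ssn = '111'")
remaining = conn.execute("SELECT COUNT(*) FROM child").fetchone()[0]
```

The composite primary key captures "identified uniquely by name when the parent is known", and the cascading delete captures "not interested in a child once the parent leaves".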

4.4 Structural Constraints on Relationship
Relationship types usually have certain constraints that limit the possible combinations of entities that
may participate in the corresponding relationship set. These constraints are determined from the
mini-world situation that the relationships represent. For example, if a company has a rule that each
employee must work for exactly one department, then we would like to describe this constraint in the
schema. We can distinguish two main types of relationship constraints: cardinality ratio and
participation.

(a) Cardinality Ratio (Multiplicity) Constraints


A multiplicity constraint is the number, or range, of possible occurrences of an entity type that may
relate to a single occurrence of an associated entity type through a particular relationship. In general,
the cardinality ratio specifies the maximum number of relationship instances that an entity can
participate in.
For example, in the WORKS_FOR binary relationship type, DEPARTMENT : EMPLOYEE is of
cardinality ratio 1:N, meaning that each department can be related to (that is, employs) any number of
employees, but an employee can be related to (work for) only one department. This means that for this
particular relationship WORKS_FOR, a particular department entity can be related to any number of
employees (N indicates there is no maximum number). On the other hand, an employee can be related to
a maximum of one department.
The possible cardinality ratios for binary relationship types are 1:1, 1:N, N:1, and M:N. Cardinality
ratios are mostly used to ensure that enterprise constraints are represented appropriately.
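
The four ratios can be made concrete by looking at a set of relationship instances. The helper below is a small sketch (the function name `cardinality_ratio` and the sample data are assumptions). It reports the ratio that a given set of instances exhibits; the real constraint still comes from the requirements, since the data at one moment may not show every allowed case.

```python
from collections import Counter


def cardinality_ratio(pairs):
    """Classify a binary relationship A:B from its instances.

    pairs: iterable of (a, b) tuples, one per relationship instance.
    Returns '1:1', '1:N', 'N:1', or 'M:N'.
    """
    pairs = set(pairs)
    per_a = Counter(a for a, _ in pairs)  # number of B's related to each A
    per_b = Counter(b for _, b in pairs)  # number of A's related to each B
    b_side_many = any(n > 1 for n in per_a.values())  # some A has several B's
    a_side_many = any(n > 1 for n in per_b.values())  # some B has several A's
    if a_side_many and b_side_many:
        return "M:N"
    if b_side_many:
        return "1:N"
    if a_side_many:
        return "N:1"
    return "1:1"


# DEPARTMENT : EMPLOYEE in WORKS_FOR, written as (department, employee)
works_for = [("D1", "E1"), ("D1", "E2"), ("D2", "E3")]
ratio = cardinality_ratio(works_for)
```

Here `ratio` is `"1:N"`: department D1 is related to two employees, while no employee is related to more than one department.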
One-to-one relationship (1:1)
• A customer is associated with at most one loan via the relationship borrower
• A loan is associated with at most one customer via borrower

E.g.: Relationship Manages between STAFF and BRANCH


The multiplicity of the relationship is:
 One branch can only have one manager
 One employee could manage either one or no branches

One-To-Many Relationships
• An entity on one side of the relationship can have many related entities, but an entity on the other
side will have a maximum of one related entity
• In the one-to-many relationship a loan is associated with at most one customer via borrower, a
customer is associated with several (including 0) loans via borrower

E.g.: Relationship Leads between STAFF and PROJECT


The multiplicity of the relationship
 One staff member may lead one or more project(s)
 One project is led by one staff member

Many-To-Many Relationship (Sometimes called non-specific)


A many-to-many relationship exists when, for one instance of entity A, there are zero, one, or many
instances of entity B, and for one instance of entity B, there are zero, one, or many instances of entity A.
For example, suppose employees can be assigned to no more than two projects at the same time, and each
project must have at least three employees assigned. A single employee can be assigned to many
projects; conversely, a single project can have many employees assigned to it.
Here the maximum cardinality from employee to project is two, and the minimum cardinality from
project to employee is three.
Many-to-many relationships cannot be directly translated to relational tables; instead, they must be
transformed into two or more one-to-many relationships using associative entities.
• A customer is associated with several (possibly 0) loans via borrower
• A loan is associated with several (possibly 0) customers via borrower

E.g.: Relationship Teaches between INSTRUCTOR and COURSE
The multiplicity of the relationship
• One Instructor teaches one or more Course(s)
• One Course is taught by zero or more Instructor(s)
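
The transformation of a many-to-many relationship into two one-to-many relationships through an associative entity can be sketched with the customer and loan example used above. This is an illustration under assumptions: the table and column names are invented, the rows are made up, and Python's built-in `sqlite3` module stands in for whatever DBMS is actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE loan     (loan_id TEXT PRIMARY KEY, amount REAL);
-- Associative entity: one row per (customer, loan) pairing.
-- customer 1:M borrower and loan 1:M borrower replace the M:N link.
CREATE TABLE borrower (
    customer_id TEXT NOT NULL REFERENCES customer(customer_id),
    loan_id     TEXT NOT NULL REFERENCES loan(loan_id),
    PRIMARY KEY (customer_id, loan_id)
);
INSERT INTO customer VALUES ('C1', 'Hana'), ('C2', 'Dawit');
INSERT INTO loan     VALUES ('L1', 1000), ('L2', 2500);
INSERT INTO borrower VALUES ('C1', 'L1'), ('C1', 'L2'), ('C2', 'L1');
""")

# A customer can have several loans ...
loans_of_c1 = [row[0] for row in conn.execute(
    "SELECT loan_id FROM borrower WHERE customer_id = 'C1' ORDER BY loan_id")]
# ... and a loan can have several customers.
customers_of_l1 = [row[0] for row in conn.execute(
    "SELECT customer_id FROM borrower WHERE loan_id = 'L1' ORDER BY customer_id")]
```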

(b) Participation of an Entity Set in a Relationship Set


Recall that relationships are bidirectional; that is, they operate in both directions. For instance, if
COURSE is related to CLASS, then by definition, CLASS is related to COURSE. Because of the
bidirectional nature of relationships, it is necessary to determine the connectivity of the relationship
from COURSE to CLASS and the connectivity of the relationship from CLASS to COURSE. Similarly,
the specific maximum and minimum cardinalities must be determined in each direction for the
relationship. Once again, you must consider the bidirectional nature of the relationship when
determining participation.
The participation constraint specifies whether the existence of an entity depends on its being related to
another entity via the relationship type. This constraint specifies the minimum number of relationship
instances that each entity can participate in, and is sometimes called the minimum cardinality
constraint.
The participation constraint determines whether it is mandatory or optional for an entity occurrence to
take a role in a relationship. There are two distinct participation constraints in this respect: total
participation and partial participation.

Total participation:
Every tuple in the entity or relation participates in at least one relationship by taking a role. This means,
every tuple in a relation will be attached with at least one other tuple. The entity with total participation
in a relationship will be connected to the relationship using a double line. The existence of a mandatory
relationship indicates that the minimum cardinality is at least 1 for the mandatory entity.
Let’s examine a few more scenarios. Suppose that Tiny College employs some professors who
conduct research without teaching classes.
If you examine the “PROFESSOR teaches CLASS” relationship, it is quite possible for a
PROFESSOR not to teach a CLASS. Therefore, CLASS is optional to PROFESSOR. On the
other hand, a CLASS must be taught by a PROFESSOR. Therefore, PROFESSOR is mandatory
to CLASS

Partial participation:
Some tuples in the entity or relation may not participate in the relationship. This means that there may
be at least one tuple from that relation not taking any role in that specific relationship. The entity with
partial participation in a relationship is connected to the relationship using a single line.
For example, in the “COURSE generates CLASS” relationship, you noted that at least some
courses do not generate a class. In other words, an entity occurrence in the COURSE table does
not necessarily require the existence of a corresponding entity occurrence in the CLASS table.
(Remember that each entity is implemented as a table.) Therefore, the CLASS entity is
considered to be optional to the COURSE entity. The existence of an optional entity indicates
that the minimum cardinality is 0 for the optional entity. (The term optionality is used to label
any condition in which one or more optional relationships exist.)
E.g. 1: Participation of EMPLOYEE in the “belongs to” relationship with DEPARTMENT is total, since
every employee should belong to a department.
Participation of DEPARTMENT in the “belongs to” relationship with EMPLOYEE is total, since every
department should have at least one employee.

E.g. 2: Participation of EMPLOYEE in the “manages” relationship with DEPARTMENT is partial, since
not all employees are managers.
Participation of DEPARTMENT in the “manages” relationship with EMPLOYEE is total, since every
department should have a manager.
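
The difference between total and partial participation in E.g. 2 can be checked mechanically on sample data. The sketch below is an assumption-laden illustration: the function name `participation` and the sample occurrences are made up, and the result describes only the given data, whereas the constraint itself is a rule taken from the requirements.

```python
def participation(entities, related):
    """Return 'total' if every entity occurrence appears in at least one
    relationship instance, otherwise 'partial'."""
    return "total" if set(entities) <= set(related) else "partial"


employees = {"E1", "E2", "E3"}
departments = {"D1", "D2"}
# Instances of "manages", written as (employee, department)
manages = {("E1", "D1"), ("E3", "D2")}

employee_side = participation(employees, (e for e, _ in manages))
department_side = participation(departments, (d for _, d in manages))
```

E2 manages nothing, so `employee_side` is `"partial"`; every department has a manager, so `department_side` is `"total"`.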

4.5 Problem in ER Modeling


The Entity-Relationship Model is a conceptual data model that views the real world as consisting of
entities and relationships. The model visually represents these concepts by the Entity-Relationship
diagram. The basic constructs of the ER model are entities, relationships, and attributes. Entities are
concepts, real or abstract, about which information is collected. Relationships are associations between
the entities. Attributes are properties which describe the entities.
While designing an ER model, one may face design problems called connection traps. Connection traps
are problems that arise from misinterpreting the meaning of certain relationships.
There are two types of connection traps: fan traps and chasm traps.

1. Fan trap:
Occurs where a model represents a relationship between entity types, but the pathway between certain
entity occurrences is ambiguous.
May exist where two or more one-to-many (1:M) relationships fan out from an entity. The problem
could be avoided by restructuring the model so that there would be no 1:M relationships fanning out
from a single entity and all the semantics of the relationship is preserved.
Example:

Semantics description of the problem;

Problem: Which car (Car1, Car3, or Car5) is used by Employee 6 (Emp6), who works in Branch 1 (Bra1)?
From this ER model one cannot tell which car is used by which staff member, since a branch can have
more than one car and a branch also has more than one employee.
To avoid the fan trap, we restructure the E-R model so that the ambiguity is removed. This results in the
following E-R model.

Semantics description of the problem;
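
The ambiguity of the fan trap can be reproduced with a few dictionaries. This is a sketch under assumptions: the restructured model is taken to be BRANCH has EMPLOYEE and EMPLOYEE uses CAR, and the car assigned to Emp6 in the second part is a made-up value.

```python
# Original model: BRANCH has EMPLOYEE (1:M) and BRANCH has CAR (1:M).
branch_of_employee = {"Emp6": "Bra1"}
cars_of_branch = {"Bra1": ["Car1", "Car3", "Car5"]}


def cars_possibly_used_by(employee):
    # The only path from an employee to a car goes through the branch,
    # so every car of that branch is a candidate.
    return cars_of_branch[branch_of_employee[employee]]


candidates = cars_possibly_used_by("Emp6")

# Restructured model (assumed): BRANCH has EMPLOYEE, EMPLOYEE uses CAR.
car_of_employee = {"Emp6": "Car3"}  # hypothetical assignment
answer = car_of_employee["Emp6"]
```

In the original model the question has three possible answers; after restructuring, the path from employee to car is direct and gives exactly one.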

2. Chasm Trap:
Occurs where a model suggests the existence of a relationship between entity types, but the pathway
does not exist between certain entity occurrences.
May exist when one or more relationships with a minimum multiplicity (cardinality) of zero form part
of the pathway between related entities.
Example:

If a set of projects is not currently active, then we cannot assign project managers to these projects. So
there are projects with no project manager, which gives the participation a minimum value of zero.
Problem:
How can we identify which BRANCH is responsible for which PROJECT? We know that, whether or not
a PROJECT is active, there is a responsible BRANCH. But since the minimum participation between
EMPLOYEE and PROJECT is zero, we cannot identify the BRANCH responsible for every PROJECT.
The solution to this chasm trap is to add another relationship between the extreme entities
(BRANCH and PROJECT).
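
The broken pathway of the chasm trap, and the effect of the added relationship, can be sketched as follows. The occurrence names (`Emp1`, `Proj1`, `Proj2`, `Bra1`, `Bra2`) and the function name are assumptions made for illustration.

```python
# BRANCH has EMPLOYEE; EMPLOYEE manages PROJECT (optional for PROJECT).
branch_of_employee = {"Emp1": "Bra1"}
manager_of_project = {"Proj1": "Emp1", "Proj2": None}  # Proj2 is inactive


def branch_via_manager(project):
    manager = manager_of_project[project]
    # With no manager, the path BRANCH - EMPLOYEE - PROJECT is broken.
    if manager is None:
        return None
    return branch_of_employee[manager]


# Fix: a direct relationship between BRANCH and PROJECT.
branch_of_project = {"Proj1": "Bra1", "Proj2": "Bra2"}
```

`branch_via_manager("Proj2")` returns `None` because the inactive project has no manager, while `branch_of_project["Proj2"]` still names the responsible branch.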

Example:
The company is organized into departments. Each department has a unique name, a unique number, and
a particular employee who manages the department. We keep track of the start date when that employee
began managing the department. A department may have several locations. A department controls a
number of projects, each of which has a unique name, a unique number, and a single location.
We store each employee’s name, Social Security number, address, salary, sex(gender), and birth date.
An employee is assigned to one department, but may work on several projects, which are not necessarily
controlled by the same department. We keep track of the current number of hours per week that an
employee works on each project. We also keep track of the direct supervisor of each employee (who is
another employee). We want to keep track of the dependents of each employee for insurance purposes.
We keep each dependent’s first name, sex, birth date, and relationship to the employee.

We can now define the entity types for the COMPANY database, based on the requirements described
above.
1. An entity type DEPARTMENT with attributes Name, Number, Locations, Manager, and
Manager_start_date. Locations is the only multivalued attribute. We can specify that both Name
and Number are (separate) key attributes because each was specified to be unique.
2. An entity type PROJECT with attributes Name, Number, Location, and Controlling_department.
Both Name and Number are (separate) key attributes.
3. An entity type EMPLOYEE with attributes Name, Ssn, Sex, Address, Salary, Birth_date,
Department, and Supervisor. Both Name and Address may be composite attributes; however, this
was not specified in the requirements. We must go back to the users to see if any of them will refer
to the individual components of Name (First_name, Middle_initial, Last_name) or of Address.
4. An entity type DEPENDENT with attributes Employee, Dependent_name, Sex, Birth_date, and
Relationship (to the employee).

So far, we have not represented the fact that an employee can work on several projects, nor have we
represented the number of hours per week an employee works on each project. This characteristic is
listed as part of the third requirement, and it can be represented by a multivalued composite attribute of
EMPLOYEE called Works_on with the simple components (Project, Hours). Alternatively, it can be
represented as a multivalued composite attribute of PROJECT called Workers with the simple
components (Employee, Hours). The Name attribute of EMPLOYEE is shown as a composite attribute,
presumably after consultation with the users.
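
The attribute kinds discussed above (composite, multivalued, and multivalued composite) map naturally onto nested data structures. The following sketch writes EMPLOYEE and DEPARTMENT as Python dataclasses. The Python names and types are assumptions, dates are kept as plain strings for brevity, and the sample employee is made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Name:                       # composite attribute
    first_name: str
    middle_initial: str
    last_name: str


@dataclass
class WorksOn:                    # one value of a multivalued composite attribute
    project: int                  # Number of the project
    hours: float                  # hours per week


@dataclass
class Employee:
    ssn: str                      # key attribute
    name: Name
    sex: str
    address: str
    salary: float
    birth_date: str
    department: int               # Number of the department
    supervisor: Optional[str] = None                      # Ssn of another employee
    works_on: List[WorksOn] = field(default_factory=list)  # multivalued


@dataclass
class Department:
    name: str                     # key attribute
    number: int                   # key attribute
    manager: str                  # Ssn of the managing employee
    manager_start_date: str
    locations: List[str] = field(default_factory=list)     # multivalued


# Hypothetical sample data
emp = Employee("123456789", Name("John", "B", "Smith"), "M",
               "Addis Ababa", 30000.0, "1990-01-09", 5)
emp.works_on.append(WorksOn(project=1, hours=32.5))
emp.works_on.append(WorksOn(project=2, hours=7.5))
total_hours = sum(w.hours for w in emp.works_on)
```

Attributes such as `department`, `supervisor`, and `works_on` refer to other entity types, which is a hint that they are better modeled as relationships in a refined design.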

Exercises
1. Consider the following set of requirements for a UNIVERSITY database that is used to keep track of
students’ transcripts.
a) The university keeps track of each student’s name, student number, Social Security number,
current address and phone number, permanent address and phone number, birth date, sex,
class (freshman, sophomore, ..., graduate), major department, minor department (if any), and
degree program (B.A., B.S., ..., Ph.D.). Some user applications need to refer to the city, state,
and ZIP Code of the student’s permanent address and to the student’s last name. Both Social
Security number and student number have unique values for each student.
b) Each department is described by a name, department code, office number, office phone
number, and college. Both name and code have unique values for each department.
c) Each course has a course name, description, course number, number of semester hours, level,
and offering department. The value of the course number is unique for each course.

d) Each section has an instructor, semester, year, course, and section number. The section
number distinguishes sections of the same course that are taught during the same
semester/year; its values are 1, 2, 3, ..., up to the number of sections taught during each
semester.
e) A grade report has a student, section, letter grade, and numeric grade (0, 1, 2, 3, or 4).
Design an ER schema for this application, and draw an ER diagram for the schema. Specify key
attributes of each entity type, and structural constraints on each relationship type. Note any unspecified
requirements, and make appropriate assumptions to make the specification complete.

2. Design an ER schema for keeping track of information about votes taken in the U.S. House of
Representatives during the current two-year congressional session. The database needs to keep track
of each U.S. STATE’s Name (e.g., ‘Texas’, ‘New York’, ‘California’) and include the Region of
the state (whose domain is {‘Northeast’, ‘Midwest’, ‘Southeast’, ‘Southwest’, ‘West’}). Each
CONGRESS_PERSON in the House of Representatives is described by his or her Name, plus the
District represented, the Start_date when the congressperson was first elected, and the political
Party to which he or she belongs (whose domain is {‘Republican’, ‘Democrat’, ‘Independent’,
‘Other’}). The database keeps track of each BILL (i.e., proposed law), including the Bill_name, the
Date_of_vote on the bill, whether the bill Passed_or_failed (whose domain is {‘Yes’, ‘No’}), and the
Sponsor (the congressperson(s) who sponsored, that is, proposed, the bill). The database also
keeps track of how each congressperson voted on each bill (domain of Vote attribute is {‘Yes’, ‘No’,
‘Abstain’, ‘Absent’}). Draw an ER schema diagram for this application. State clearly any
assumptions you make.
3. A database is being constructed to keep track of the teams and games of a sports league. A team has
a number of players, not all of whom participate in each game. It is desired to keep track of the
players participating in each game for each team, the positions they played in that game, and the
result of the game. Design an ER schema diagram for this application, stating any assumptions you
make. Choose your favorite sport (e.g., soccer, baseball, football).
4. Consider an entity type SECTION in a UNIVERSITY database, which describes the section
offerings of courses. The attributes of SECTION are Section_number, Semester, Year,
Course_number, Instructor, Room_no (where section is taught), Building (where section is taught),
Weekdays (domain is the possible combinations of weekdays in which a section can be offered
{‘MWF’, ‘MW’, ‘TT’, and so on}), and Hours (domain is all possible time periods during which
sections are offered {‘9–9:50 A.M.’, ‘10–10:50 A.M.’, ..., ‘3:30–4:50 P.M.’, ‘5:30–6:20 P.M.’,
and so on}). Assume that Section_number is unique for each course within a particular
semester/year combination (that is, if a course is offered multiple times during a particular semester,
its section offerings are numbered 1, 2, 3, and so on). There are several composite keys for section,
and some attributes are components of more than one key. Identify three composite keys, and show
how they can be represented in an ER schema diagram.

4.6 Enhanced E-R (EER) Models
The EER model includes all the modeling concepts of the ER model that were presented in earlier
discussion of this chapter. In addition, it includes the concepts of subclass and superclass and the related
concepts of specialization and generalization. Another concept included in the EER model is that of a
category or union type, which is used to represent a collection of objects (entities) that is the union of
objects of different entity types. Associated with these concepts is the important mechanism of attribute
and relationship inheritance. Unfortunately, no standard terminology exists for these concepts, so we use
the most common terminology and we also describe a diagrammatic technique for displaying these
concepts when they arise in an EER schema. We call the resulting schema diagrams enhanced ER or
EER diagrams.
The EER model can be described as follows:
 It adds object-oriented extensions to the E-R model.
 EER is important when there is a relationship between two entities and the participation of
entity occurrences is partial. In such cases, EER is used to reduce the complexity of
participation and relationships.
 ER diagrams consider entity types to be primitive objects.
 EER diagrams allow refinements within the structures of entity types.
 EER concepts:
     Subclasses
     Superclasses
     Specialization
     Generalization
     Attribute inheritance
     Constraints on specialization and generalization

(a) Subclass/Subtype Vs Superclass /Supertype


As discussed earlier in this chapter, an entity type is used to represent both a type of entity and
the entity set or collection of entities of that type that exist in the database. For example, the entity type
EMPLOYEE describes the type (that is, the attributes and relationships) of each employee entity, and
also refers to the current set of EMPLOYEE entities in the COMPANY database.
In many cases an entity type has numerous sub-groupings or subtypes of its entities that are meaningful
and need to be represented explicitly because of their significance to the database application. For
example, the entities that are members of the EMPLOYEE entity type may be distinguished further into
SECRETARY, ENGINEER, MANAGER, TECHNICIAN, SALARIED_EMPLOYEE,
HOURLY_EMPLOYEE , and so on. The set of entities in each of the latter groupings is a subset of the
entities that belong to the EMPLOYEE entity set, meaning that every entity that is a member of one of
these sub-groupings is also an employee. We call each of these sub-groupings a subclass or subtype of
the EMPLOYEE entity type, and the EMPLOYEE entity type is called the superclass or supertype for
each of these subclasses. The figure below shows how to represent these concepts diagrammatically in
EER diagrams. (The circle notation in the figure is explained later.)

Superclass/Supertype Entity
• Is the generalized entity
• An entity type whose occurrences share common attributes. Attributes that are shared by all entity
occurrences (including the identifier) are associated with the supertype.

Subclass/Subtype Entity
• An entity type whose occurrences have attributes that distinguish them from other occurrences of the
generalized or superclass entity.
• When one generalized superclass has various subgroups with distinguishing features and these
subgroups are represented in specialized form, the groups are called subclasses.
• Subclasses can be either mutually exclusive (disjoint) or overlapping (inclusive).
• A single subclass may inherit attributes from two distinct superclasses.
• A mutually exclusive category/subclass is when an entity instance can be in only one of the
subclasses.
E.g.: An EMPLOYEE can either be SALARIED or PART-TIMER but not both.
• An overlapping category/subclass is when an entity instance may be in two or more subclasses.
E.g.: A PERSON who works for a university can be both an EMPLOYEE and a
STUDENT at the same time.
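
The subset idea behind subclasses can be sketched in Python with plain sets. This is only an illustration: the identifiers `p1` to `p4`, the variable names, and the helper `is_subclass_of` are hypothetical and do not come from any schema in this chapter.

```python
# Hypothetical entity sets; each entity is represented by an identifier.
person = {"p1", "p2", "p3", "p4"}      # superclass entity set
employee = {"p1", "p2"}                # subclass entity sets
student = {"p2", "p3"}

def is_subclass_of(subclass: set, superclass: set) -> bool:
    """Every member of a subclass must also be a member of the superclass."""
    return subclass <= superclass

# Both subclasses are subsets of PERSON.
print(is_subclass_of(employee, person), is_subclass_of(student, person))

# The subclasses overlap: p2 is both an EMPLOYEE and a STUDENT.
print(employee & student)
```

An empty intersection would instead indicate that the two subclasses are mutually exclusive for the current data.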

(b) Specialization
• Specialization is the process of defining a set of subclasses of an entity type;
• The set of subclasses that forms a specialization is defined on the basis of some distinguishing
characteristic of the entities in the superclass.
• The specialization process identifies the distinguishing features of some entity occurrences and
specializes them into different subclasses.
• A specialized entity set is formed by taking a subset of a higher-level entity set to produce a
lower-level entity set.
• The specialized entities have an additional set of attributes (distinguishing characteristics)
that distinguish them from the generalized entity.
• Specialization is considered a top-down definition of entities.
• Reasons for specialization:
o Some attributes apply to only some of the entities in the superclass.
o Some relationship types apply to only some of the entities in the superclass.
• In many cases, an entity type has numerous sub-groupings of its entities that are meaningful
and need to be represented explicitly. This need requires the representation of each subgroup in
the ER model. The generalized entity is a superclass and the set of specialized entities will be
subclasses for that specific Superclass.
• Example: Saving Accounts and Current Accounts are specialized entities of the
generalized entity Accounts. Manager, Sales, and Secretary are specialized employees.
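
As a rough sketch of the top-down idea, the Python snippet below splits a set of account entities into subclasses using one distinguishing attribute. The attribute name `kind`, the sample values, and the helper `specialize` are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical ACCOUNT entities; 'kind' is the assumed distinguishing attribute.
accounts = [
    {"acc_no": 1, "balance": 500.0, "kind": "saving"},
    {"acc_no": 2, "balance": 120.0, "kind": "current"},
    {"acc_no": 3, "balance": 980.0, "kind": "saving"},
]

def specialize(entities, attribute):
    """Group superclass entities into subclasses by a distinguishing attribute."""
    subclasses = defaultdict(list)
    for entity in entities:
        subclasses[entity[attribute]].append(entity)
    return dict(subclasses)

subclasses = specialize(accounts, "kind")
for name, members in subclasses.items():
    print(name, [m["acc_no"] for m in members])
```

Each resulting group plays the role of one subclass, and every member is still an account.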
(c) Generalization
• Generalization is the process of defining a more general entity type from a set of more
specialized entity types.
• A generalization hierarchy is a form of abstraction that specifies that two or more entities that
share common attributes can be generalized into a higher level entity type.
• Generalization occurs when two or more entities represent categories of the same real-world
object.
• Generalization is considered a bottom-up definition of entities.
• A generalization hierarchy depicts the relationship between a higher-level superclass and its
lower-level subclasses.
Generalization hierarchies can be nested. That is, a subtype of one hierarchy can be a supertype of
another. The level of nesting is limited only by the constraint of simplicity.

Example: Vehicle is a generalized form of Car and Truck.
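
The bottom-up direction can be sketched by intersecting the attribute sets of the specialized entity types. The attribute names below are invented for illustration, and `generalize` is a hypothetical helper, not a standard function.

```python
# Hypothetical attribute sets for two specialized entity types.
car = {"vehicle_id", "price", "licence_plate", "max_speed", "passenger_count"}
truck = {"vehicle_id", "price", "licence_plate", "axle_count", "tonnage"}

def generalize(*entity_types: set) -> set:
    """Return the attributes shared by all given entity types."""
    return set.intersection(*entity_types)

vehicle = generalize(car, truck)   # attributes of the generalized VEHICLE
car_only = car - vehicle           # attributes that stay on the CAR subclass
truck_only = truck - vehicle       # attributes that stay on the TRUCK subclass

print(sorted(vehicle))
print(sorted(car_only), sorted(truck_only))
```

The shared attributes move up to the generalized entity type, while the remaining attributes stay with the entity type they distinguish.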

Relationship Between Superclass and Subclass


• The relationship between a superclass and any of its subclasses is called a superclass/subclass or
class/subclass relationship.
• An instance cannot be a member of a subclass only; i.e., every instance of a subclass is also an
instance of the superclass.
• A member of a subclass is represented as a distinct database object, a distinct record that is
related via the key attribute to its superclass entity.
• An entity cannot exist in the database merely by being a member of a subclass; it must also be a
member of the superclass.
• An entity occurrence of a superclass need not belong to any of the subclasses unless
participation in the specialization is total.
• The relationship between a subclass and a superclass is an "IS A" or "IS PART OF" type.
o Subclass IS PART OF Superclass
o Manager IS AN Employee
• All subclasses or specialized entity sets should be connected with the superclass using a line to a
circle, with a subset symbol on the line indicating the direction of the subclass/superclass relationship.


• We can also have subclasses of a subclass, forming a hierarchy of specializations.

• Superclass attributes are shared by all subclasses of that superclass.
• Subclass attributes are unique to the subclass.
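
One common way to realize "a distinct record that is related via the key attribute to its superclass entity" is to give the subclass table the same primary key as the superclass table. The sketch below uses Python's built-in `sqlite3` module. The table names, column names, and sample row are assumptions, and this is only one possible mapping rather than the one prescribed by this chapter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when enabled

# Superclass table holds the shared attributes; the subclass table reuses its key.
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE manager ("
    " emp_id INTEGER PRIMARY KEY REFERENCES employee(emp_id),"
    " bonus REAL)"
)

conn.execute("INSERT INTO employee VALUES (1, 'Abebe')")
conn.execute("INSERT INTO manager VALUES (1, 2500.0)")   # Manager IS AN Employee

# A subclass record without a matching superclass record is rejected.
try:
    conn.execute("INSERT INTO manager VALUES (99, 100.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

row = conn.execute(
    "SELECT e.name, m.bonus FROM manager m JOIN employee e USING (emp_id)"
).fetchone()
print(orphan_rejected, row)
```

The foreign key on the shared key is what prevents an entity from existing only as a member of the subclass.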

(d) Attribute Inheritance


• An entity that is a member of a subclass inherits all the attributes of the entity as a member of the
superclass.
• The entity also inherits all the relationships in which the superclass participates.
• An entity may belong to more than one subclass.
• All entities/subclasses of a generalized entity or superclass share a common unique identifier
attribute (primary key); i.e., the primary key of the superclass and the primary keys of its
subclasses are always identical.

• Consider the EMPLOYEE supertype entity shown above. This entity can have several different
subtype entities (for example, HOURLY and SALARIED), each with distinct properties not
shared by other subtypes. But whether the employee is HOURLY or SALARIED, the same
attributes (EmployeeId, Name, and DateHired) are shared.
• The supertype EMPLOYEE stores all properties that the subclasses have in common. HOURLY
employees have the unique attribute Wage (hourly wage rate), while SALARIED
employees have two unique attributes, StockOption and Salary.

(e) Constraints on specialization and generalization

Completeness Constraint.
• The completeness constraint addresses whether or not an occurrence of a superclass
must also have a corresponding subclass occurrence.
• Every instance of a subtype is, by definition, represented in the supertype; the completeness
constraint concerns the reverse direction.
• The Total Specialization Rule specifies that every entity occurrence of the superclass must be a
member of at least one of the subclasses. Total participation of superclass instances in subclasses
is diagrammed with a double line from the supertype to the circle as shown below.
E.g.: If we have EXTENSION and REGULAR as subclasses of a superclass STUDENT,
then each student must be either an EXTENSION or a REGULAR student.
Thus the participation of instances of STUDENT in the EXTENSION and REGULAR
subclasses is total.

• The Partial Specialization Rule specifies that an entity occurrence in the superclass need not
be a member of any of the subclasses. Here participation in the specialization is optional.
Partial participation of superclass instances in subclasses is diagrammed with
a single line from the supertype to the circle.
E.g.: If we have MANAGER and SECRETARY as subclasses of a superclass EMPLOYEE,
then it is not the case that every employee is either a manager or a secretary. Thus the
participation of instances of EMPLOYEE in the MANAGER and SECRETARY subclasses
is partial.
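
The difference between total and partial specialization can be checked mechanically on sample data. In this Python sketch the identifiers and the helper `is_total` are hypothetical. It tests whether the subclasses together cover the whole superclass.

```python
def is_total(superclass: set, *subclasses: set) -> bool:
    """True when every superclass member belongs to at least one subclass."""
    covered = set().union(*subclasses)
    return superclass <= covered

# Hypothetical identifiers for the STUDENT example (total specialization).
student = {"s1", "s2", "s3"}
extension = {"s1"}
regular = {"s2", "s3"}

# Hypothetical identifiers for the EMPLOYEE example (partial specialization).
employee = {"e1", "e2", "e3"}
manager = {"e1"}
secretary = {"e2"}          # e3 is neither a manager nor a secretary

print(is_total(student, extension, regular))
print(is_total(employee, manager, secretary))
```

Note that a check on current data can only reveal a violation. The constraint itself is a design decision that must hold for all future data as well.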

Disjointness Constraints
• The disjointness constraint specifies whether one entity occurrence can be a member of more than
one subclass; i.e., it is a type of business rule that deals with the situation where an entity
occurrence of a superclass may have more than one subclass occurrence.
• The Disjoint Rule restricts an entity occurrence of a superclass to being a member of only one of
the subclasses. Example: An EMPLOYEE can be either SALARIED or a PART-TIMER, but not
both at the same time.
• The Overlap Rule allows one entity occurrence to be a member of more than one
subclass. Example: A PERSON working at the university can be both a STUDENT and an
EMPLOYEE at the same time.
• This is diagrammed by placing either the letter "d" for disjoint or "o" for overlapping inside the
circle on the generalization hierarchy portion of the E-R diagram.

The two types of constraints on generalization and specialization (disjointness and completeness
constraints) are independent of one another. That is, whether a specialization is disjoint or overlapping
does not determine whether the superclass has total or partial participation in that specialization.
Combining the two types of constraints gives four possible cases:
• Disjoint AND Total
• Disjoint AND Partial
• Overlapping AND Total
• Overlapping AND Partial
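
Because the two constraints are independent, each can be tested separately and the results combined. The Python sketch below labels a specialization with one of the four combinations; the function `classify` and the identifiers are hypothetical and serve only to illustrate the idea.

```python
from itertools import combinations

def classify(superclass: set, subclasses: list) -> str:
    """Label a specialization with its disjointness and completeness constraints."""
    # Disjoint: no pair of subclasses shares a member.
    disjoint = all(not (a & b) for a, b in combinations(subclasses, 2))
    # Total: the subclasses together cover the whole superclass.
    total = superclass <= set().union(*subclasses)
    return f"{'Disjoint' if disjoint else 'Overlapping'} AND {'Total' if total else 'Partial'}"

people = {"p1", "p2", "p3"}                      # hypothetical superclass members

print(classify(people, [{"p1"}, {"p2", "p3"}]))        # each member in exactly one subclass
print(classify(people, [{"p1"}, {"p2"}]))              # p3 is in no subclass
print(classify(people, [{"p1", "p2"}, {"p2", "p3"}]))  # p2 is in both subclasses
print(classify(people, [{"p1", "p2"}, {"p2"}]))        # p2 in both, p3 in none
```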
