Redundancy Dependency Loss of Information
Normalization also helps to minimize the use of null values and to prevent loss of
information.
• Normalization allows the database designer to understand the current data structures in
an organization.
• Edgar Codd, the inventor of the relational model, proposed the theory of normalization
with the introduction of First Normal Form, and he went on to extend the theory with
Second and Third Normal Form.
Normalization Principles
- Relational design principles for normalized relations:
- Any relation that is not well-formed should be broken down into two or more well-
formed relations
• TIP: as a general rule, a well-formed relation will not encompass more than one
business concept!
Aims of Normalization
• To reduce the physical space needed to store data.
• To ensure that the database is structured in the best possible way (data becomes better
organized).
• To ensure tables have a flexible structure, e.g. the number of classes taken or books
borrowed should not be limited.
Anomalies
Anomalies are inconvenient or error-prone situations arising when we process the
tables.
i. Update anomalies: an update anomaly exists when one or more instances of duplicated
data are updated, but not all.
For example, if Jones changes address, every instance of Jones's address must be
updated; if any is missed, the table holds conflicting addresses for Jones.
StudentNum | CourseNum | Student Name | Address | Course
ii. Delete anomalies: a delete anomaly exists when certain attributes are lost because of
the deletion of other attributes.
Consider the table above: what happens if student S30 is removed on leaving the course?
If S30 was the only student on that course, the course details are lost as well.
iii. Insert anomalies: an insert anomaly occurs when certain attributes cannot be
inserted into the database without the presence of other attributes.
- For example, as the converse of the delete anomaly, we cannot add a new course unless we
have at least one student enrolled on that course.
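
The three anomalies can be reproduced on a small in-memory table. The following is only a sketch: the rows, the course names, and the assumption that the key is (student_num, course_num) are hypothetical and are not taken from the original table.

```python
import copy

# Hypothetical rows for the table above; all values are made up for illustration.
enrolments = [
    {"student_num": "S21", "course_num": "C1", "student_name": "Jones",
     "address": "Edinburgh", "course": "Databases"},
    {"student_num": "S21", "course_num": "C2", "student_name": "Jones",
     "address": "Edinburgh", "course": "Networks"},
    {"student_num": "S30", "course_num": "C3", "student_name": "Smith",
     "address": "Glasgow", "course": "Graphics"},
]

# Update anomaly: Jones moves, but only one of the two duplicated rows is changed.
updated = copy.deepcopy(enrolments)
updated[0]["address"] = "Aberdeen"
jones_addresses = {r["address"] for r in updated if r["student_num"] == "S21"}

# Delete anomaly: removing S30's only row also removes the only record of course C3.
after_delete = [r for r in enrolments if r["student_num"] != "S30"]
remaining_courses = {r["course_num"] for r in after_delete}

# Insert anomaly: a course with no student has no value for part of the assumed key.
def can_insert(row):
    """A row is insertable only if both assumed key columns have a value."""
    return row["student_num"] is not None and row["course_num"] is not None

new_course = {"student_num": None, "course_num": "C4", "student_name": None,
              "address": None, "course": "Security"}

print(jones_addresses)         # two different addresses for the same student
print(remaining_courses)       # C3 is no longer recorded anywhere
print(can_insert(new_course))  # the new course cannot be stored on its own
```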
Stages of Normalization
• Normalization involves applying a series of tests to a relation to determine whether it
satisfies or violates the requirements of a given normal form.
• When a test fails, the relation is decomposed into simpler relations that individually meet
the normalization tests.
• The higher the normal form, the less vulnerable to update anomalies the relation
becomes.
• Three normal forms, 1NF, 2NF and 3NF, were initially proposed by Codd.
• All these normal forms are based on the functional dependencies among the attributes of
a relation.
• Normalization follows a staged process that obeys a set of rules. The steps of
normalization are:
• Step 1: Select the data source and convert it into an unnormalized table (UNF)
• Step 2: Transform the unnormalized data into first normal form (1NF)
• Step 3: Transform data in first normal form (1NF) into second normal form (2NF)
• Step 4: Transform data in second normal form (2NF) into third normal form (3NF)
First Normal Form (1NF)
• A table is said to be in 1NF if it has no multi-valued attributes.
Why is the table on the right in 1NF?
Because it does not contain repeating groups (no
multi-valued attributes).
• This way of eliminating multi-valued attributes has a serious drawback of its own: it is
prone to update anomalies.
b. The other method is to decompose the table into two tables.
A table containing a multi-valued attribute can also be converted into a 1NF table
by changing that attribute into atomic attributes.
N.B.:
- There is a structural change to the
table, and
- Storage space is wasted
• STUDENT: Unnormalized table
• To convert the above table from unnormalized form to 1NF, simply make
any repeated attributes part of the candidate key.
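
As a rough illustration of this conversion, the sketch below flattens a repeating group into one row per value, so that the repeated attribute joins the candidate key. The STUDENT rows, the column names, and the helper `to_1nf` are hypothetical; the original table is not reproduced here.

```python
# Hypothetical unnormalized STUDENT rows: "courses" is a multi-valued attribute.
unnormalized = [
    {"student_id": 1, "name": "Abebe", "courses": ["DB", "OS"]},
    {"student_id": 2, "name": "Sara", "courses": ["DB"]},
]

def to_1nf(rows, multi_attr, atomic_attr):
    """Emit one row per value of the multi-valued attribute."""
    flat = []
    for row in rows:
        for value in row[multi_attr]:
            new_row = {k: v for k, v in row.items() if k != multi_attr}
            new_row[atomic_attr] = value
            flat.append(new_row)
    return flat

student_1nf = to_1nf(unnormalized, "courses", "course")

# After flattening, (student_id, course) identifies each row.
keys = [(r["student_id"], r["course"]) for r in student_1nf]

for row in student_1nf:
    print(row)
```

Note that the student's name is now repeated on every row for that student, which is the redundancy that the later normal forms address.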
• STUDENT: First Normal Form table
Second Normal Form (2NF)
• A table is said to be in 2NF if both of the following conditions hold:
- The table is in 1NF
- No non-key attribute depends on only part of the primary key (no partial dependency)
Functional Dependency
• The concept of functional dependency is central to normalization and, in particular,
strongly related to 2NF
• Functional dependency is the relationship that describes how the value of one attribute
may be used to find the value of another attribute.
• Determinant
• It is an attribute that can be used to find the value of another attribute in the relation.
Example: If X is a set of attributes within a relation, then we say A (an attribute or set
of attributes) is functionally dependent on X if, for each value of X, there is only one
corresponding value of A. This is written X → A.
• For example, the values of the attributes Name and City can be determined by
knowing the value of Reg. #.
• Reg. # → Name, City
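
A functional dependency can be tested against sample rows with a few lines of code. This is a minimal sketch: the function `holds` and the student rows are hypothetical, and a passing check only shows that the sample data is consistent with the dependency, not that the dependency is true of the business rules.

```python
def holds(rows, determinant, dependent):
    """Return True if each determinant value maps to exactly one dependent value."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in determinant)
        a = tuple(row[b] for b in dependent)
        # setdefault stores the first value seen for x and returns it afterwards.
        if seen.setdefault(x, a) != a:
            return False
    return True

# Hypothetical rows; reg_no stands in for "Reg. #".
students = [
    {"reg_no": "R1", "name": "Abel", "city": "Adama"},
    {"reg_no": "R2", "name": "Sara", "city": "Adama"},
    {"reg_no": "R3", "name": "Abel", "city": "Hawassa"},
]

reg_determines_name_city = holds(students, ["reg_no"], ["name", "city"])
city_determines_name = holds(students, ["city"], ["name"])

print(reg_determines_name_city)
print(city_determines_name)
```

Here reg_no determines name and city, while city does not determine name because two different students live in the same city.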
• Partial dependency
• Transitive dependency
• Partial dependency: a dependency where a non-key attribute is functionally dependent on
only part of the composite key.
• In the above table, knowing the value of the attribute Emp_ID is enough to determine the
value of the non-key attribute Name (Graphically: Emp_ID → Name).
• Hence the non-key attribute Name is partially dependent on the composite key.
• Similarly, knowing the value of SW-ID determines the value of the non-key attribute
SW-Title (Graphically: SW-ID → SW-Title).
• Full dependency: a dependency where a non-key attribute is functionally
dependent on the complete key.
• For example: the value of Hrs-Worked can be determined only by knowing the values of
both parts of the composite key (Emp-ID and SW-ID).
• Therefore the attribute Hrs-Worked is fully functionally dependent on Emp-ID and SW-ID
(Graphically: Emp-ID, SW-ID → Hrs-Worked).
• Transitive dependency: a dependency where a non-key attribute is determined by another
non-key attribute.
• For example: the non-key attribute Date-Completion can be determined by the other
non-key attribute Project-ID (Graphically: Project-ID → Date-Completion).
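
The difference between partial and full dependency on a composite key can be sketched in code. The rows below and the helper names `holds` and `dependency_kind` are hypothetical; the column names follow the Emp-ID / SW-ID example, and the check only reflects what the sample data shows.

```python
from itertools import combinations

def holds(rows, determinant, dependent):
    """Return True if each determinant value maps to exactly one dependent value."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in determinant)
        a = tuple(row[b] for b in dependent)
        if seen.setdefault(x, a) != a:
            return False
    return True

def dependency_kind(rows, key, attr):
    """Classify attr as 'partial', 'full' or 'none' with respect to the composite key."""
    if not holds(rows, key, [attr]):
        return "none"
    # Any proper subset of the key that determines attr makes the dependency partial.
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            if holds(rows, list(subset), [attr]):
                return "partial"
    return "full"

# Hypothetical rows; the composite key is assumed to be (emp_id, sw_id).
work = [
    {"emp_id": "E1", "sw_id": "S1", "name": "Alice", "sw_title": "Word", "hrs_worked": 10},
    {"emp_id": "E1", "sw_id": "S2", "name": "Alice", "sw_title": "Excel", "hrs_worked": 5},
    {"emp_id": "E2", "sw_id": "S1", "name": "Bob", "sw_title": "Word", "hrs_worked": 7},
]

key = ["emp_id", "sw_id"]
kinds = {attr: dependency_kind(work, key, attr)
         for attr in ("name", "sw_title", "hrs_worked")}
print(kinds)
```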
Second Normal Form (2NF)
• A table is said to be in 2NF if it is in 1NF and no column that is not part of the primary
key is dependent on only a portion of the primary key.
• A relational table containing full dependencies along with partial dependencies
can be decomposed into smaller tables.
• The determinant of each partial dependency can become the primary key of the
corresponding new table.
• For example, the following table satisfies the rules of 1NF (no
multivalued attributes), but not those of 2NF.
• In this table the composite key is (Emp_ID, SW-ID), yet Emp_ID alone determines the non-key
attribute Name (Emp_ID → Name) and SW-ID alone determines SW-Title (SW-ID → SW-Title).
• I.e. Name and SW-Title are each partially dependent on the composite key, which is what
violates 2NF.
• A table in 2NF is in 1NF and, in addition, every non-key attribute is fully functionally
dependent on the entire primary key (i.e. every non-key attribute must be defined by the
entire key, not by only part of the key).
• No partial dependency
• Because every part of the composite key must have a value, a row cannot be added when one
part is unknown (for example, an employee not yet assigned any software). So we are forced
to stop adding rows: an insert anomaly.
• In addition, multiple updates are needed because data is redundantly recorded (employee
name and software title): an update anomaly.
• If we delete the last row, the information associated with it is also deleted, such as
the Visual Basic entry, as only a single employee is working with it: a delete anomaly.
• Tips:
• Move any non-key attributes that depend on only part of the table key (partial
dependencies) to a new table.
• What has to be determined is: “is field A dependent upon field B, or vice versa?”
• This means: “Given a value for A, do we then have only one possible value for B, and
vice versa?”
• If the answer is yes, A and B should be put into a new relation with A becoming the
primary key.
The process is as follows:
• Take each non-key attribute in turn and ask the question: is this attribute dependent on
one part of the key?
• If yes, move the attribute to a new table with a copy of the part of the key it is
dependent upon. The part of the key it is dependent upon becomes the key in the new table.
Underline the key in this new table.
• If no, check against the other part of the key and repeat the above process.
• If still no, i.e. not dependent on either part of the key, keep the attribute in the current table.
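
The process above amounts to projecting the 1NF table onto smaller sets of columns. The sketch below assumes the Emp-ID / SW-ID example with made-up rows; `project` is a hypothetical helper, not a library function.

```python
def project(rows, attrs):
    """Keep only the given columns and drop duplicate rows, preserving order."""
    seen = set()
    result = []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attrs, values)))
    return result

# Hypothetical 1NF table with assumed composite key (emp_id, sw_id).
work = [
    {"emp_id": "E1", "sw_id": "S1", "name": "Alice", "sw_title": "Word", "hrs_worked": 10},
    {"emp_id": "E1", "sw_id": "S2", "name": "Alice", "sw_title": "Excel", "hrs_worked": 5},
    {"emp_id": "E2", "sw_id": "S1", "name": "Bob", "sw_title": "Word", "hrs_worked": 7},
]

# name depends on emp_id only, sw_title on sw_id only: each moves to its own table.
employee = project(work, ["emp_id", "name"])            # key: emp_id
software = project(work, ["sw_id", "sw_title"])         # key: sw_id
# hrs_worked depends on the whole key, so it stays with the composite key.
hours = project(work, ["emp_id", "sw_id", "hrs_worked"])

# Joining the three tables back on their keys reproduces the original rows.
rejoined = [
    {**h, **e, **s}
    for h in hours
    for e in employee
    for s in software
    if h["emp_id"] == e["emp_id"] and h["sw_id"] == s["sw_id"]
]
print(len(employee), len(software), len(hours))
print(rejoined == work)
```

Each employee name and each software title is now stored once, and rejoining the tables loses no information.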
Functional Dependency
• It is clear that:
• RefNo → Name, Address. Or, more correctly,
• AccNo, RefNo → Name, Address, Status
Tables in 2NF
Third Normal Form
A table is in 3NF:
If it is in 2NF and no non-prime attribute is transitively dependent on any
super key.
Or, put differently, there should be no non-key column that depends on another non-key column
which could not itself act as a primary key (transitive dependency).
• Solution: a non-key determinant with a transitive dependency goes into a new table; the
non-key determinant becomes the primary key in the new table and remains as a foreign key in
the old table.
• I.e. move the dependent attribute, together with a copy of the non-key attribute upon
which it is dependent, to a new table.
• Make the non-key attribute upon which it is dependent the key in the new table.
Underline the key in this new table.
• Leave the non-key attribute upon which it is dependent in the original table and
mark it as a foreign key.
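
These steps can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the table names, column names and rows are hypothetical, based on the Project-ID → Date-Completion example, with project_id moved to its own table and kept as a foreign key in the employee table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE project (
    project_id      TEXT PRIMARY KEY,   -- former non-key determinant, now the key
    date_completion TEXT                -- the transitively dependent attribute
);
CREATE TABLE employee (
    emp_id     TEXT PRIMARY KEY,
    name       TEXT,
    project_id TEXT REFERENCES project(project_id)  -- stays behind as a foreign key
);
""")

conn.execute("INSERT INTO project VALUES ('P1', '2024-06-30')")
conn.execute("INSERT INTO project VALUES ('P2', '2024-12-31')")  # no employee needed
conn.execute("INSERT INTO employee VALUES ('E1', 'Alice', 'P1')")
conn.execute("INSERT INTO employee VALUES ('E2', 'Bob', NULL)")  # no project needed

# An employee row pointing at a project that does not exist is rejected.
try:
    conn.execute("INSERT INTO employee VALUES ('E3', 'Carol', 'P9')")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

project_count = conn.execute("SELECT COUNT(*) FROM project").fetchone()[0]
employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(project_count, employee_count, fk_rejected)
```

The completion date of each project is stored once, so changing it needs a single update.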
• From the above 3NF tables we can easily observe that there is no anomaly at all.
• Now we can easily add a new project without needing an existing employee.
• We can also add a new employee without needing an existing project.
b. Multi attribute determinant
Not in 3NF
Table in 3NF
Exercise
Does it satisfy 2NF?
No. Why?
- Here the primary key is (Studio,
Movie) and City depends only on the
Studio, not on the whole key
- So, it is not in 2NF
Exercise
Solution
Exercise
Solution
3NF
Does this table satisfy 3NF?
No
Why?