Database Management System (DBMS)


Contents
 Data Hierarchy
 Traditional File Processing
 Database Approach to Data Management
 DBMS Features and Capabilities
 Database Schemas
 Components of DBMS
 Data Models
 RDBMS
 Normalization
   What is it and why is it required?
   Background to Normalization: Definitions
   The Process of Normalization
Data Hierarchy
Data Hierarchy refers to the systematic organization of
data, often in a hierarchical form. A computer system
organizes data in a hierarchy that starts with bits and bytes
and progresses to fields, records, files, and databases. A bit
represents the smallest unit of data a computer can handle.
A group of bits, called a byte, represents a single character, which can be a letter, a number, or another symbol.

Data organization involves fields, records, files and so on.


 A field holds a single fact. Consider a date field, e.g.
"September 19, 2004". This can be treated as a single date
field (e.g. birthdate) or as three fields: month, day of
month, and year.
 A record is a collection of related fields. An Employee record
may contain a name field (or fields), address fields, and a birthdate field.
Data Hierarchy, contd.
A file is a collection of related records. If there are 100
employees, then each employee would have a record (e.g.
called Employee Personal Details record) and the
collection of 100 such records would constitute a file (in
this
case, called Employee Personal Details file).

 Files are integrated into a database. This is done using a
Database Management System. If there are other facets
of employee data that we wish to capture, then other files
such as an Employee Training History file and an Employee
Work History file could be created as well.
Traditional File Processing
The use of a traditional approach to file processing encourages
each functional area in a corporation to develop specialized
applications. Each application requires a unique data file that is
likely to be a subset of the master file. These subsets of the
master file lead to data redundancy and inconsistency, processing
inflexibility, and wasted storage resources.

 Each application requires its own files and its own computer
program to operate. For example, the human resources functional
area might have a personnel master file, a payroll file, a medical
insurance file, a pension file, a mailing list file, and so forth, until
tens, perhaps hundreds, of files and programs exist. In the
company as a whole, this process leads to multiple master files
created, maintained, and operated by separate divisions or
departments. As this goes on for 5 or 10 years, the
organization is saddled with hundreds of programs and
applications that are very difficult to maintain and manage. The
resulting problems are data redundancy and
inconsistency, program-data dependence, inflexibility, poor data
security, and an inability to share data among applications.
Database Approach to Data Management
Database
A database is a logically coherent collection of data with some
inherent meaning, representing some aspect of the real world,
and which is designed, built, and populated with data for a
specific purpose.
DBMS
A Database Management System (DBMS) is a set of software
programs that enables users to define, create, and maintain a
database. The DBMS also enforces necessary access restrictions
and security measures in order to protect the database.

Database technology cuts through many of the problems that a traditional
file organization creates. A database serves many applications
efficiently by centralizing the data and controlling redundant data.
Rather than storing data in separate files for each
application, data are stored so as to appear to users as being
stored in only one location.
For example, instead of a corporation storing employee data in
separate information systems and separate files for
personnel, payroll, and benefits, the corporation creates a single
common human resources database.
DBMS Features and Capabilities
 Query ability: Querying is the process of requesting attribute
information from various perspectives and combinations of
factors.

 Backup and Replication: Copies of attributes are regularly
created to cater to the situation when primary disks or other
equipment fails. Data is consistently replicated among various
database servers.

 Rule Enforcement: Application of rules to attributes so that
attributes are clean and reliable, with the ability to add and update
rules without significant data layout redesign.

 Security: Application of limits on who can see or change which
attributes or groups of attributes.

 Control of Redundancy
DBMS Features and Capabilities, contd.
 Computation: There are common computations requested on
attributes, such as counting, summing, averaging, sorting, grouping,
cross-referencing, etc.

 Change and Access Logging: Often one wants to know who
accessed which attributes, what was changed, and when it was
changed. Logging services allow this by keeping a record of
access occurrences and changes.

 Automated Optimization: If there are frequently occurring
usage patterns or requests, some DBMSs can adjust themselves to
improve the speed of those interactions. In some cases the
DBMS will merely provide tools to monitor performance, allowing
a human expert to make the necessary adjustments after
reviewing the statistics collected.

 Provides multiple user interfaces


Database Schema
Database Schema: A database schema is the structure of the database, described in
a formal language supported by the DBMS. In a relational
database, the schema defines the tables, the fields in each
table, and the relationships between fields and tables.

The three levels of abstraction are:

1. Physical level: The lowest level of abstraction describes how data
is stored: files, indices, etc. on the random-access disk system.
It also typically describes the record layout of files and the type of
files (hash, b-tree, flat).
2. Logical level: Hides details of the physical level. In the relational
model, this schema presents data as a set of tables. The
DBMS maps data access from the logical to the physical schema
automatically.
 The physical schema can be changed without changing the application:
 The DBMS must change the mapping from conceptual to physical.
 This is referred to as physical data independence.
Database Schema, contd.
3. View level (External Schema):
It is tailored to the needs of a particular
category of users. It hides portions of stored data
that should not be seen by some users and
simplifies the view for these users. E.g.
students should not see faculty salaries.

Applications are written in terms of an
external schema. The external view is
computed when accessed; it is not
stored. Translation from the external level to
the logical level is done automatically by the
DBMS at run time. The conceptual
schema can be changed without changing
the application; only the mapping from external to
conceptual must be changed. This is referred
to as conceptual data independence.
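As a rough illustration of an external schema, the salary example above could be handled with a view that simply omits the sensitive column. This is a minimal sketch; the table and column names are hypothetical, not taken from the slides.

-- Logical-level table; the Salary column should not be visible to students
CREATE TABLE Faculty (
    FacultyID   INT PRIMARY KEY,
    Name        VARCHAR(100),
    Department  VARCHAR(50),
    Salary      DECIMAL(10, 2)
);

-- External schema for students: computed when accessed, not stored
CREATE VIEW FacultyDirectory AS
SELECT FacultyID, Name, Department
FROM Faculty;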
Components of DBMS
A database management system has three components:

1. A data definition language (DDL) is the formal
language programmers use to specify the structure of the
content of the database. The DDL defines each data element
as it appears in the database before that data element is
translated into the forms required by application
programs. With it, a database schema can be defined
and also changed later.

Typical DDL operations (with their respective keywords in SQL):
 Creation of tables and definition of attributes (CREATE TABLE ...)
 Change of tables by adding or deleting attributes (ALTER TABLE ...)
 Deletion of a whole table, including its content (!) (DROP TABLE ...)
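A minimal sketch of these DDL operations in SQL, using a hypothetical Employee table (the table and column names are illustrative, not from a specific system):

-- Creation of a table and definition of its attributes
CREATE TABLE Employee (
    EmployeeID  INT PRIMARY KEY,
    Name        VARCHAR(100),
    BirthDate   DATE
);

-- Change of the table by adding an attribute
ALTER TABLE Employee ADD COLUMN Department VARCHAR(50);

-- Deletion of the whole table, including its content
DROP TABLE Employee;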
Components of DBMS, contd.
2. A data manipulation language (DML) is a language
for describing operations on data, such as storing,
searching, reading, and changing (the so-called data
manipulation). Typical DML operations (with
their respective keywords in the structured query
language SQL):
 Add data (INSERT)
 Change data (UPDATE)
 Delete data (DELETE)
 Query data (SELECT)

Often the DDL and DML for the definition and manipulation of
databases are combined in one comprehensive language.
A good example is the structured query language SQL.
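Continuing the hypothetical Employee table from the DDL sketch above, the four DML operations could look like this (values are invented for illustration):

-- Add data
INSERT INTO Employee (EmployeeID, Name, BirthDate)
VALUES (101, 'A. Kumar', '1985-07-21');

-- Change data
UPDATE Employee
SET Name = 'A. K. Kumar'
WHERE EmployeeID = 101;

-- Query data
SELECT EmployeeID, Name, BirthDate
FROM Employee
WHERE EmployeeID = 101;

-- Delete data
DELETE FROM Employee
WHERE EmployeeID = 101;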
Components of DBMS, contd.
3. Data Dictionary: This is an automated or manual file
that stores definitions of data elements and data
characteristics, such as usage, physical
representation, ownership (who in the organization is
responsible for maintaining the data), authorization, and
security.

Many data dictionaries can produce lists and reports of
data use, groupings, program locations, and so on.
Data Models
A data model is a theory or specification describing how a database
is structured and used.
A data model is not just a way of structuring data: it also defines a
set of operations that can be performed on the data. The
relational model, for example, defines operations such as
select and join. Although these operations may not be explicit in a
particular query language, they provide the foundation on which
a query language is built.

Common Data Models:
 Hierarchical Model
 Network Model
 Relational Model
 Object Model (Object Oriented Database Management System)

The relational model is the most widely used model today.


Hierarchical Model
In a hierarchical model, the data is
organized into a tree-like structure.
The structure allows repeating
information using parent/child
relationships: each parent can have
many children, but each child has only
one parent. This structure is simple but
inflexible because the relationship is
confined to a one-to-many relationship.
These models were popular in the late
1960s and 1970s. The most widely
used hierarchical database
is IMS, developed by IBM.
Network Model
The network model is a variation on the
hierarchical model, allowing each record to
have multiple parent and child records.

 Network models generally implement the set
relationships by means of pointers that
directly address the location of a record on
disk. This gives excellent retrieval
performance, at the expense of operations
such as database loading and
reorganization.

 Some well-known DBMSs using the Network Model:
 Honeywell IDS (Integrated Data Store)
 IDMS (Integrated Database Management System)
Relational Model
The data is stored in two-dimensional tables (rows and columns).
The data is manipulated based on the relational theory of
mathematics.

Properties of Relational Tables:
 Values Are Atomic
 Each Row is Unique
 Column Values Are of the Same Kind
 The Sequence of Columns is Insignificant
 The Sequence of Rows is Insignificant
 Each Column Has a Unique Name

A relational database management system (RDBMS) is a DBMS that
is based on the relational model.

Some well-known RDBMSs:
IBM DB2, Informix, Microsoft SQL Server, Microsoft Visual
FoxPro, MySQL, Oracle, Sybase, Teradata, Microsoft Access
Object Model
Object model (ODBMS, object-oriented database management
system): The data is stored in the form of objects. The structure
of the data is defined by classes, and the stored objects are
instances of these classes.

The object-oriented structure has the ability to handle data types
such as graphics, pictures, voice, and text without difficulty,
unlike the other database structures. This structure is popular for
multimedia Web-based applications. It was designed to work with
object-oriented programming languages such as Java.
RDBMS
An RDBMS stores information in a set of "tables", each of which has a
unique identifier or "primary key" (PK). The tables are then related to
one another using "foreign keys" (FK). A foreign key is an attribute that
refers to the primary key of a different table.

Consider, for example, a Customer table and an Orders table: "Customer ID" is the
PK in one table and the FK in the other. There is a one-to-many relationship
between the two tables: one customer can have one or more orders.
A given order, however, can be initiated by one and only one customer.
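A sketch of these two tables in SQL (column names are assumed for illustration); the FOREIGN KEY clause expresses the one-to-many relationship described above:

CREATE TABLE Customer (
    CustomerID  INT PRIMARY KEY,      -- PK of the Customer table
    Name        VARCHAR(100)
);

CREATE TABLE Orders (
    OrderID     INT PRIMARY KEY,      -- PK of the Orders table
    OrderDate   DATE,
    CustomerID  INT NOT NULL,         -- FK: each order belongs to exactly one customer
    FOREIGN KEY (CustomerID) REFERENCES Customer (CustomerID)
);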
Normalization
Normalization is a systematic way of ensuring that a database structure is
suitable for general-purpose querying and free of certain undesirable
characteristics that could lead to a loss of data integrity.

The objectives of normalization:
 Free the database of modification anomalies
 Minimize redesign when extending the database structure
 Make the data model more informative to users
 Avoid bias towards any particular pattern of querying

In general, relational databases should be normalized to the "third normal form".
Background to Normalization: Definitions
Functional Dependency: If A and B are attributes of relation R, B is
functionally dependent on A (denoted A → B) if each A value is
associated with precisely one B value.

In other words, in every legal instance of R, whenever two tuples
agree on their A values, they also agree on their B values.
The determinant of a functional dependency refers to the attribute or group of
attributes on the left-hand side of the arrow.

e.g. in an "Employee" table that includes the attributes "Employee ID" and
"Employee Date of Birth", the functional dependency {Employee ID} →
{Employee Date of Birth} would hold.
Background to Normalization: Definitions
Full Functional Dependency
 A and B are attributes of a relation.
 B is fully dependent on A if B is functionally dependent on A but
not on any proper subset of A.

A functional dependency X → Y is a full functional dependency if the removal of
any attribute A from X means that the dependency no longer holds.
Background to Normalization: Definitions
Transitive Dependency: A transitive dependency is an indirect functional
dependency. Let A, B, and C designate three distinct attributes in the
relation. Suppose all three of the following conditions hold:
 A → B
 It is not the case that B → A
 B → C
Then the functional dependency A → C is a transitive dependency.

For example, suppose the functional dependency {Book} → {Author Nationality} applies; that is,
if we know the book, we know the author's nationality. Furthermore:
 {Book} → {Author}
 {Author} → {Author Nationality}
 {Author} does not → {Book}
Therefore {Book} → {Author Nationality} is a transitive dependency.
Background to Normalization: Definitions
An Index or Key is an attribute or collection of attributes that may be used
to identify or retrieve one or more records.

SuperKey: A superkey is a set of columns within a table whose values can
be used to uniquely identify a row.
E.g. imagine a table with the fields <Name>, <Age>, <SSN> and <Phone
Extension>. This table has many possible superkeys. Three of these are
<SSN>, <Phone Extension, Name> and <SSN, Name>. Of those
listed, only <SSN> is a candidate key, as the others contain information
not necessary to uniquely identify records.

A candidate key is a key that can be used to uniquely identify a record, i.e., it
may be used to retrieve one specific record.
The primary key of a relation is a candidate key that has been designated
as the main key.
A foreign key is an attribute (or collection of attributes) in a relation that can
be used as a key to another relation. Foreign keys link tables together to
form an integrated database.
The Process of Normalization
There are two main goals of the normalization process:
eliminate redundant data (for example, storing the same
data in more than one table) and ensure data dependencies
make sense (only storing related data in a table). Both of
these are worthy goals, as they reduce the amount of space
a database consumes and ensure that data is logically
stored.

 Formal technique for analyzing a relation based on its
primary key and the functional dependencies between
its attributes.
 Often executed as a series of steps. Each step
corresponds to a specific normal form, which has known
properties.
 As normalization proceeds, relations become progressively
more restricted (stronger) in format and also less
vulnerable to update anomalies.
First Normal Form (1NF)
No Repeating Elements or Groups of Elements

A relation in which the intersection of each row and column contains one
and only one value.
 All key attributes get defined
 No repeating groups in table
 All attributes dependent on primary key

UNF to 1NF:
 Eliminate duplicative columns from the same table (in other
words, remove subsets of data that apply to multiple rows of a
table and place them in separate tables).
 Create separate tables for each group of related data and identify
each row with a unique column or set of columns (the primary
key).
 Create relationships between these new tables and their
predecessors through the use of foreign keys.
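A rough before/after sketch of these steps, assuming a hypothetical order table with a repeating group of item columns (all names and sizes are invented for illustration):

-- Unnormalized: duplicative columns form a repeating group
CREATE TABLE OrdersUNF (
    OrderID   INT PRIMARY KEY,
    Customer  VARCHAR(100),
    Item1     VARCHAR(50),
    Item2     VARCHAR(50),
    Item3     VARCHAR(50)
);

-- 1NF: the repeating group moves to a separate table,
-- identified by its own primary key and linked back by a foreign key
CREATE TABLE Orders1NF (
    OrderID   INT PRIMARY KEY,
    Customer  VARCHAR(100)
);

CREATE TABLE OrderItem (
    OrderID   INT REFERENCES Orders1NF (OrderID),
    LineNo    INT,
    Item      VARCHAR(50),
    PRIMARY KEY (OrderID, LineNo)
);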
Second Normal Form (2NF)
No Partial Dependencies on a Concatenated Key

A relation that is in 1NF and in which every non-primary-key attribute is fully
functionally dependent on the primary key (no partial dependency).

1NF to 2NF:
 Identify primary key for the 1NF relation.
 Identify functional dependencies in the relation.
 If partial dependencies on the primary key exist, remove them by placing
them in a new relation along with a copy of their determinant (in other
words, remove columns that are not fully dependent upon the primary
key).
 Create relationships between these new tables and their predecessors
through the use of foreign keys.
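A sketch of the same idea, assuming a hypothetical 1NF relation whose composite key is (OrderID, ProductID) and in which ProductName depends only on ProductID, i.e. a partial dependency:

-- 1NF: ProductName depends on ProductID alone, not on the whole key
CREATE TABLE OrderLine1NF (
    OrderID     INT,
    ProductID   INT,
    ProductName VARCHAR(100),
    Quantity    INT,
    PRIMARY KEY (OrderID, ProductID)
);

-- 2NF: the partially dependent attribute moves out with its determinant
CREATE TABLE Product (
    ProductID   INT PRIMARY KEY,
    ProductName VARCHAR(100)
);

CREATE TABLE OrderLine2NF (
    OrderID     INT,
    ProductID   INT REFERENCES Product (ProductID),
    Quantity    INT,
    PRIMARY KEY (OrderID, ProductID)
);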
Third Normal Form (3NF)
No Dependencies on Non-Key Attributes

A relation that is in 1NF and 2NF and in which no non-primary-key attribute is
transitively dependent on the primary key.

2NF to 3NF
 Identify the primary key in the 2NF relation.
 Identify functional dependencies in the relation.
 If transitive dependencies on the primary key exist, remove them by
placing them in a new relation along with a copy of their determinant.
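Using the Book / Author Nationality example from the definitions slides, a sketch of this decomposition (column sizes and exact names are assumptions):

-- 2NF relation with a transitive dependency:
-- {Book} -> {Author} and {Author} -> {AuthorNationality}
CREATE TABLE Book2NF (
    Book              VARCHAR(200) PRIMARY KEY,
    Author            VARCHAR(100),
    AuthorNationality VARCHAR(50)
);

-- 3NF: the transitively dependent attribute moves out with its determinant
CREATE TABLE Author (
    Author            VARCHAR(100) PRIMARY KEY,
    AuthorNationality VARCHAR(50)
);

CREATE TABLE Book3NF (
    Book    VARCHAR(200) PRIMARY KEY,
    Author  VARCHAR(100) REFERENCES Author (Author)
);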
Thank You
