DBMS


INTRODUCTION
Data
raw facts and figures

Data Processing
performing operations on the input data to generate output.
Data are logically organized into:
1. Bits (grouped into characters)
2. Fields
3. Records
4. Files
5. Databases
What is a Database?
A database is a computer-based record-keeping system that is used to record, maintain and retrieve data. It is an organized collection of interrelated data.
What is a Database Management System (DBMS)?
• Collection of interrelated data
• Set of programs to access the data
• It provides a convenient and efficient way to store, retrieve and
modify information.
• Application programs request DBMS to retrieve,
modify/insert/delete data for them and thus it acts as a layer of
abstraction between the application programs and the file
system.
DATABASE APPLICATIONS:
• Banking: all transactions
• Airlines: reservations, schedules
• Universities: registration, grades
• Sales: customers, products, purchases
• Online retailers: order tracking, customized recommendations
• Manufacturing: production, inventory, orders, supply chain
• Human resources: employee records, salaries, tax deductions
Purpose of Database Systems
• Need for Database systems arose in response to early methods of
computerized management of commercial data.
• One way to keep the information on a computer is to store it in operating
system files.
• To allow users to manipulate the information, application programs must be written for each task, for example to:
• Add new students, instructors and courses.
• Assign grades to students, compute grade point averages (GPA), and
generate transcripts

In the early days, database applications were built directly on top of file
systems
• Drawbacks of using file systems to store data:
• Data redundancy (repetition of information)
• Data inconsistency (multiple copies of the same data are not updated together)
• Difficulty in accessing data
• We must know the physical details of the file before accessing it (such as
location, name, permissions, file format, etc.)
• Data isolation: data is scattered across multiple files and formats
• Integrity problems: consistency rules are buried in application code, so they are hard to add or change.
• Searching is difficult because there is no index on the records.
• Concurrent access by multiple users
• Concurrent access is needed for performance.
• Uncontrolled concurrent accesses can lead to inconsistencies
• Example: Two people reading a balance and updating it at the same time.
• Security problems
• Hard to provide user access to some, but not all, data.
• Database systems offer solutions to all the above problems
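The balance example above (two people reading a balance and updating it at the same time) can be sketched in Python with the standard sqlite3 module. This is only an illustration under stated assumptions: the account table, the starting balance of 100 and the deposits of 50 and 30 are hypothetical, and the two "users" are simulated one after the other on a single connection.

```python
import sqlite3

# Hypothetical single-account table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()

def read_balance():
    return conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]

# Uncontrolled access: both users read the balance before either one writes.
seen_by_a = read_balance()
seen_by_b = read_balance()
conn.execute("UPDATE account SET balance = ? WHERE id = 1", (seen_by_a + 50,))
conn.execute("UPDATE account SET balance = ? WHERE id = 1", (seen_by_b + 30,))
conn.commit()
lost_update_balance = read_balance()  # user A's deposit was overwritten

# Controlled access: each deposit is one read-modify-write statement.
conn.execute("UPDATE account SET balance = 100 WHERE id = 1")
conn.execute("UPDATE account SET balance = balance + 50 WHERE id = 1")
conn.execute("UPDATE account SET balance = balance + 30 WHERE id = 1")
conn.commit()
correct_balance = read_balance()

print(lost_update_balance, correct_balance)
```

In the first half one deposit is silently lost; in the second half the database applies each change to the current value, so both deposits survive.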
Components of DBMS
• The DBMS software is partitioned into several modules. Each module or
component is assigned a specific operation to perform.
• Some of the functions of the DBMS are supported by operating systems (OS) to
provide basic services.
• The physical data and system catalog are stored on a physical disk. Access to the
disk is controlled primarily by OS.
• The major software modules or components of DBMS are as follows:
• Query processor
• Run time database manager
• Data Manager
Query Processor
• It interprets the online user query and converts it into a form that can be
sent to the data manager for execution.
• The query processor uses the data dictionary to find the structure of a database.
• It is a program module that provides the interface between the database and the
application programs/queries.
• The Query Processor Components include:
• Data Definition Language (DDL) compiler (Create, Alter [Add, Drop, Modify], Drop, Describe)
• Data Manipulation Language (DML) compiler (Insert, Update, Delete, Select)
• Query evaluation engine
Data Dictionary
• A data dictionary is a reserved space within a database which is used to store
information about the database itself.
• A data dictionary is a set of tables and views that users can only read; only the
DBMS itself updates it.
• The data dictionary also records how much space has been allocated for, and is
currently used by, all the schema objects.
• A data dictionary is used when finding information about users, objects, schema
and storage structures.
• Every time a data definition language (DDL) statement is issued, the data
dictionary becomes modified.
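As a small illustration of the last point, the sketch below uses Python's sqlite3 module. SQLite keeps its schema catalog in a built-in table named sqlite_master, which plays the role of a data dictionary here; the student table and student_names view are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def catalog_names():
    # sqlite_master is SQLite's built-in schema table (its data dictionary).
    rows = conn.execute("SELECT name FROM sqlite_master ORDER BY name").fetchall()
    return [r[0] for r in rows]

before = catalog_names()

# Each DDL statement adds an entry to the catalog.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE VIEW student_names AS SELECT name FROM student")

after = catalog_names()
print(before, after)
```

The application never writes to the catalog directly; it changes only as a side effect of the CREATE statements.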
• A data dictionary may contain information such as:
• Database design information
• Stored SQL procedures
• User permissions
• User statistics
• Database process information
• Database growth statistics
• Database performance statistics
Runtime Database Manager
• Run time database manager is the central software component of the DBMS.
• It handles database access at run time.
• It accepts queries and examines the external and conceptual schemas to
determine what conceptual records are required to satisfy the user’s request.
• It enforces constraints to maintain the consistency and integrity of the data, as
well as its security.
• It also performs backup and recovery operations.
• It has the following components:
• Authorization control
• Command processor
• Integrity checker
• Query optimizer
• Transaction manager
• Scheduler
• Authorization control: The authorization control module checks whether a user holds
the privileges required for the requested operation.
• Command processor: The command processor processes the queries passed on by the
authorization control module.
• Integrity checker: It checks the integrity constraints so that only valid data can be
entered into the database.
• Query optimizer: The query optimizer determines an optimal strategy for the query
execution.
• Transaction manager: The transaction manager ensures that the
transaction properties are maintained by the system.
• Scheduler: It provides an environment in which multiple users can work on the same
piece of data at the same time. In other words, it supports concurrency.
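The roles of the integrity checker and the transaction manager can be sketched with Python's sqlite3 module. This is a minimal illustration, not a description of any particular DBMS internals: the account table, the names alice and bob, and the transfer function are hypothetical. The CHECK constraint stands in for the integrity checker, and the rollback shows one transaction property (all-or-nothing execution).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, "
             "balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("alice", 100), ("bob", 20)])
conn.commit()

def transfer(src, dst, amount):
    """Move money between accounts; either both updates apply or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer("alice", "bob", 70)        # valid: alice 30, bob 90
rejected = transfer("alice", "bob", 70)  # would make alice negative
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, rejected, balances)
```

In the second call the credit to bob has already been applied when the debit violates the constraint, so the rollback is what keeps the data consistent.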
Data Manager
• The data manager is responsible for the actual handling of data in the database.
• It provides recovery, so that the system is able to recover
the data after a failure.
• It includes the Recovery manager and the Buffer manager.
• The buffer manager is responsible for the transfer of data between the
main memory and secondary storage (such as disk or tape). It is also referred to as
the cache manager.
INSTANCE & SCHEMA
• Databases change over time as information is inserted and deleted.
• The collection of information stored in the database at a particular
moment is called an instance of the database.
• The overall design of the database is called the database schema.
• Database systems have several schemas, partitioned according to the
levels of abstraction.
• The physical schema describes the database design at the physical
level, while the logical schema describes the database design at the
logical level.
THREE LEVEL DATABASE
ARCHITECTURE
• Data are actually stored as bits, or numbers and strings, but
it is difficult to work with data at this level.
• It is necessary to view data at different levels of abstraction:
• Physical Data Level
• Conceptual Data Level
• External Data Level/View Level.
[Figure: three-level database architecture]
• View Level (View 1, View 2, …, View n): What data do users and application programs see?
• Logical / Conceptual Level: What data is stored? Describes data properties such as data semantics and data relationships.
• Physical Level: How is data actually stored? E.g. are we using disks? Which file system?
Physical Level in Database Architecture
• Describes how data is physically stored in the database: the file
format, record format, index of the file, location of the file, etc.
• It also typically describes the record layout of files and type of files (hash,
b-tree, flat).
• Early applications worked at this level - explicitly dealt with details.
• Problems:
• Changes to data structures are difficult to make.
• Application code becomes complex since it must deal with details.
• Rapid implementation of new features very difficult.
• Routines are hardcoded to deal with physical representation.
Conceptual Level in Database Architecture
• The next-higher level of abstraction describes what data are
stored in the database, and what relationships exist among those data.
• Also referred to as the Logical level.
• Hides details of the physical level.
• The DBMS maps data access between the conceptual and physical schemas
automatically.
• Represents:
• entities, attributes, relations
• constraints on data
• semantic information on data
• security, integrity information
View Level in Database Architecture
• The highest level of abstraction describes only part of the entire database.
• The user’s view of the database.
• Consists of a number of different external views of the DB.
• Provides a powerful and flexible security mechanism by hiding parts of the DB
from certain users. The user is not aware of the existence of any attributes
that are missing from the view.
• Examples:
• Students should not see faculty salaries.
• Faculty should not see billing or payment data.
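A view that hides faculty salaries can be sketched with Python's sqlite3 module. The faculty table, its columns and the faculty_public view are hypothetical names. SQLite itself has no user accounts, so this only shows what the view exposes; in a multi-user DBMS, access would additionally be granted on the view rather than on the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE faculty "
             "(id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER)")
conn.execute("INSERT INTO faculty VALUES (1, 'Rao', 'CS', 90000)")

# External view for students: the salary column is simply not part of it.
conn.execute("CREATE VIEW faculty_public AS SELECT id, name, dept FROM faculty")

cur = conn.execute("SELECT * FROM faculty_public")
visible_columns = [d[0] for d in cur.description]
rows = cur.fetchall()
print(visible_columns, rows)
```

A user who queries only faculty_public has no indication that a salary attribute exists.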
Data Independence
• Data independence is the ability to modify the schema
definition at one level without affecting
the schema definition at the next higher
level.
• Two types of Data Independence:
• Physical
• Logical
Logical Data Independence
What do you mean by logical data independence?
The ability to change the logical schema without changing the
external schema or application programs is called Logical Data
Independence.
Example:
The name field is stored in the conceptual view as first
name, middle name and last name, whereas in the external
view it remains a single name field.
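The name example can be sketched with Python's sqlite3 module. The person table, the person_view view and the sample row are hypothetical; the point is only that the external view keeps offering a single name field while the logical schema stores three parts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical schema: the name is stored as three separate fields.
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, "
             "first_name TEXT, middle_name TEXT, last_name TEXT)")
conn.execute("INSERT INTO person VALUES (1, 'Edgar', 'Frank', 'Codd')")

# External schema: applications still see one 'name' column.
conn.execute("""
    CREATE VIEW person_view AS
    SELECT id, first_name || ' ' || middle_name || ' ' || last_name AS name
    FROM person
""")

full_name = conn.execute(
    "SELECT name FROM person_view WHERE id = 1").fetchone()[0]
print(full_name)
```

An application that reads name from person_view does not need to change when the underlying table is restructured, as long as the view definition is updated to match.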
Physical Data Independence
What do you mean by Physical Data Independence?
• The ability to change the physical schema without changing the
logical schema is called Physical Data Independence.
• Modifications at this level are usually made to improve performance.
• Changes in the physical schema may include:
• Using new storage devices.
• Using different data structures.
• Switching from one access method to another.
• Using different file organizations or storage structures.
• Modifying indexes.
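One of the changes listed above, modifying indexes, can be sketched with Python's sqlite3 module. The student table, its rows and the index name idx_student_gpa are hypothetical. The same query text is run before and after the index is created, and the logical result is unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, gpa REAL)")
conn.executemany("INSERT INTO student (name, gpa) VALUES (?, ?)",
                 [("Asha", 3.9), ("Ben", 3.1), ("Chen", 3.9)])

query = "SELECT name FROM student WHERE gpa = 3.9 ORDER BY name"

before_index = conn.execute(query).fetchall()

# Physical change only: add an access path; the table definition is untouched.
conn.execute("CREATE INDEX idx_student_gpa ON student (gpa)")

after_index = conn.execute(query).fetchall()
print(before_index == after_index)
```

Only the way the rows are located may change; the query and the application code that issues it stay the same.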
Data Models

A database model defines the logical design of data. The model also
describes the relationships between different parts of the data. In the
history of database design, three models have been in use: the
hierarchical model, the network model and the relational model.
• A model is a representation of reality, ‘real world’ objects and
events and their associations.
• A model is a collection of tools for describing
• Data
• Data relationships
• Data semantics
• Data constraints
• The various models fall into three different categories:
• Object-based data models
• Record Based Data Models
• Physical Data Models
Object Based Data Models
• Use the concept of entities, attributes and relationships.
• Used to describe data at the logical and view level.
• It provides flexible structuring capabilities.
• The most widely known ones are:
• The Entity-Relationship Model
• The Object-Oriented Model.
Entity-Relationship Model
• It consists of basic objects called “Entities”, and “Relationships” among
these objects.
• An entity is a thing or object in the real world that is distinguishable
from other objects.
• Entities are described by a set of attributes.
• Entities are the principal data objects about which information is to be collected.
• A relationship is an association among several entities.
• Basic constructs of ER Model
• Entities
• Entity set (Set of entities of same type that share the same properties)
• Relationships
• Attributes (Describe the properties of entity)
• Simple Attribute
• Compound/Composite Attribute
• Derived Attribute
• Multivalued Attribute
• Degree of relationship
• No. of entities associated with the relationship (n-ary)
• Cardinality (Mapping)
Entity-Relationship Model (Cardinality)
• One to one cardinality (One student has one Address/Contact No.)
• One to many cardinality (A department has many employees)
• Many to one cardinality (Many employees are working in one dept.)
• Many to Many cardinality (A student can register in many courses and each
course has many students; employees can be assigned to more than one project
at a time)
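One common way to realize these cardinalities in tables is sketched below with Python's sqlite3 module. All table and column names (department, employee, project, assignment) and the sample rows are hypothetical. A foreign key column gives the many-to-one side, and a separate junction table gives many-to-many.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);

    -- Many-to-one: each employee row points at exactly one department.
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT,
        dept_id INTEGER REFERENCES department(dept_id));

    CREATE TABLE project (proj_id INTEGER PRIMARY KEY, title TEXT);

    -- Many-to-many: a junction table pairs employees with projects.
    CREATE TABLE assignment (
        emp_id  INTEGER REFERENCES employee(emp_id),
        proj_id INTEGER REFERENCES project(proj_id),
        PRIMARY KEY (emp_id, proj_id));

    INSERT INTO department VALUES (1, 'Sales');
    INSERT INTO employee VALUES (10, 'Asha', 1), (11, 'Ben', 1);
    INSERT INTO project VALUES (100, 'Audit'), (101, 'Launch');
    INSERT INTO assignment VALUES (10, 100), (10, 101), (11, 100);
""")

def count(sql):
    return conn.execute(sql).fetchone()[0]

employees_in_sales = count("SELECT COUNT(*) FROM employee WHERE dept_id = 1")
projects_of_asha = count("SELECT COUNT(*) FROM assignment WHERE emp_id = 10")
staff_on_audit = count("SELECT COUNT(*) FROM assignment WHERE proj_id = 100")
print(employees_in_sales, projects_of_asha, staff_on_audit)
```

One department has several employees (one to many), while one employee has several projects and one project has several employees (many to many).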
Object Oriented Data Model
• Objects that contain the same types of values and the same methods
are grouped together into classes.
• Like the ER model, the OODM is based on a collection of objects.
• An object contains values stored in instance variables within the object.
Record Based Data Models
• Record based models are used to describe data at the logical and view level.
• They are so named because the database is structured
in fixed-format records of several types.
• The record based models are:
• Relational Model
• Network Model
• Hierarchical Model
Relational Model

• The relational data model is the primary data model, used widely
around the world for data storage and processing.
• It stores data in the form of tables.
• This concept was proposed by E.F. Codd, a researcher at IBM, in
1970.
• It uses a collection of tables to represent both data and the relationships among
the data.
• Each table has multiple columns and each column has a unique name.
Basic Terminology used in Relational
Model
• Tuple of a relation (Each row is called a tuple)
• Cardinality of a relation (No. of tuples in a relation)
• Degree of a relation (No. of attributes in a relation)
• Domain (Set of all possible values that an attribute may validly contain)
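These terms can be made concrete with a short sketch using Python's sqlite3 module. The student relation and its two rows are hypothetical. The degree is read from the column metadata and the cardinality from the number of rows returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER, name TEXT, dept TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(1, "Asha", "CS"), (2, "Ben", "EE")])

cur = conn.execute("SELECT * FROM student")
degree = len(cur.description)   # number of attributes (columns)
tuples = cur.fetchall()         # each row is one tuple of the relation
cardinality = len(tuples)       # number of tuples (rows)

print(degree, cardinality, tuples[0])
```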
Hierarchical database model

• Organizes data in a tree structure (parent and child data).
• Each entity except the root has exactly one parent but can have several children.
• At the top of the hierarchy, there is one entity, which is called the root.
• 1:N mapping between record types.
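The tree structure can be sketched in plain Python. This is a toy in-memory illustration, not how a hierarchical DBMS stores records; the node names are hypothetical. Because each record stores a single parent, there is exactly one path from any record up to the root.

```python
# Each record stores a single parent, so the structure is a tree.
# The root is the only record whose parent is None.
parent_of = {
    "University": None,
    "Engineering": "University",
    "Science": "University",
    "CS": "Engineering",
    "EE": "Engineering",
}

def children(node):
    """Return the child records of a node (the 'N' side of 1:N)."""
    return sorted(n for n, p in parent_of.items() if p == node)

def path_to_root(node):
    """Follow parent links upward; the path is unique in a hierarchy."""
    path = [node]
    while parent_of[path[-1]] is not None:
        path.append(parent_of[path[-1]])
    return path

print(children("Engineering"))
print(path_to_root("CS"))
```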
• Advantages:
• Simplicity (Conceptually Simple)
• Data security
• Data Integrity
• Efficiency (Speed of access is faster because of the predefined data paths)
• Disadvantages:
• Implementation complexity
• Database management problems
• Programming complexity
• Implementation limitations
Network database model
• In the network model, the entities are organized in a graph, in which
some entities can be accessed through several paths.
• Some data is modeled with more than one parent per child.
• Permits modeling of many-to-many relationships in data.
• Data is represented by a collection of records, and relationships among data
are represented by links.
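The graph structure can be sketched in plain Python. This is a toy in-memory illustration with hypothetical record names, not an actual network DBMS. Records are connected by links, a record may have more than one parent, and the same record can therefore be reached through several paths.

```python
# Links from each record to the records it owns.
# "Asha" has two parents, which a strict tree would not allow.
links = {
    "Company": ["Sales", "Audit Project"],
    "Sales": ["Asha"],
    "Audit Project": ["Asha", "Ben"],
    "Asha": [],
    "Ben": [],
}

def parents(record):
    """Return every record that links to the given record."""
    return sorted(p for p, kids in links.items() if record in kids)

def paths(start, target):
    """Return all link paths from start to target."""
    if start == target:
        return [[start]]
    found = []
    for child in links[start]:
        for tail in paths(child, target):
            found.append([start] + tail)
    return found

print(parents("Asha"))
print(paths("Company", "Asha"))
```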
• Advantages:
• Easy to access data
• Can handle more relationship types
• Conceptually simple
• Data Integrity

• Disadvantages:
• System complexity (pointers)
• Operational anomalies (a change to any record can require a large number of
pointer adjustments)
• Absence of structural independence.
Relational database model
In the relational model, data is organized in two-dimensional tables
called relations.
Database Languages
• DDL (Data Definition Language)
• DML (Data Manipulation Language)
• DCL (Data Control Language)
Database Constraint
Introduction
• Constraints are the rules enforced on the data columns of a table.
These are used to limit the type of data that can go into a
table. This ensures the accuracy and reliability of the data in
the database.
• Constraints could be column level or table level.
• Column level constraints are applied only to one column,
whereas table level constraints are applied to the whole
table.
Types of Constraints
• Following are commonly used constraints available in SQL:
• NOT NULL Constraint: Ensures that a column cannot have NULL value.
• DEFAULT Constraint: Provides a default value for a column when none is specified.
• UNIQUE Constraint: Ensures that all values in a column are different. Unlike a
primary key column, the column can contain NULL.
• PRIMARY Key: Uniquely identifies each row/record in a database table.
• FOREIGN Key: Refers to the key that uniquely identifies a row/record in another database table.
• CHECK Constraint: The CHECK constraint ensures that all values in a column satisfy
certain conditions.
NULL Value Constraint
• By default, a column can hold NULL values. If you do not want a column to have a
NULL value, define a NOT NULL constraint on that column to specify
that NULL is not allowed for it.
• NULL is not the same as zero or an empty value; it represents unknown data.
DEFAULT constraint
• The DEFAULT constraint provides a default value to a column when
the INSERT INTO statement does not provide a specific value.
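The NOT NULL and DEFAULT constraints can be sketched together with Python's sqlite3 module. The customers table layout, the sample names and the default salary of 5000.0 are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers ("
             "id INTEGER PRIMARY KEY, "
             "name TEXT NOT NULL, "
             "salary REAL DEFAULT 5000.0)")

# Salary omitted: the DEFAULT value is stored.
conn.execute("INSERT INTO customers (name) VALUES ('Asha')")
# Salary given explicitly: the DEFAULT is not used.
conn.execute("INSERT INTO customers (name, salary) VALUES ('Ben', 7200.0)")

# A NULL name violates the NOT NULL constraint and the row is rejected.
try:
    conn.execute("INSERT INTO customers (name) VALUES (NULL)")
    not_null_enforced = False
except sqlite3.IntegrityError:
    not_null_enforced = True

salaries = [r[0] for r in
            conn.execute("SELECT salary FROM customers ORDER BY id")]
print(not_null_enforced, salaries)
```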
UNIQUE constraint
The UNIQUE Constraint prevents two records from having identical values in
a particular column. In the CUSTOMERS table, for example, you might want
to prevent two or more people from having identical age.

If CUSTOMERS table has already been created, then to add a UNIQUE constraint to AGE
column, you would write a statement similar to the following:
ALTER TABLE CUSTOMERS MODIFY AGE INT NOT NULL UNIQUE;
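The effect of the constraint can be sketched with Python's sqlite3 module. SQLite does not support ALTER TABLE ... MODIFY, so in this sketch the constraint is declared when the table is created instead; the column list and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers ("
             "id INTEGER PRIMARY KEY, "
             "name TEXT, "
             "age INT NOT NULL UNIQUE)")

conn.execute("INSERT INTO customers (name, age) VALUES ('Asha', 30)")

# A second customer with the same age violates the UNIQUE constraint.
try:
    conn.execute("INSERT INTO customers (name, age) VALUES ('Ben', 30)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

ages = [r[0] for r in conn.execute("SELECT age FROM customers")]
print(duplicate_rejected, ages)
```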
PRIMARY Key Constraint
• A primary key is a field in a table which uniquely identifies each row/record in a database table.
Primary keys must contain unique values. A primary key column cannot have NULL values.
• A table can have only one primary key, which may consist of single or multiple fields. When
multiple fields are used as a primary key, they are called a composite key.
• If a table has a primary key defined on any field(s), then you cannot have two records with the
same value in those field(s).
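A composite primary key can be sketched with Python's sqlite3 module. The enrollment table and its columns are hypothetical. Neither student_id nor course_id is unique on its own, but the pair must be.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollment ("
             "student_id INTEGER, "
             "course_id INTEGER, "
             "grade TEXT, "
             "PRIMARY KEY (student_id, course_id))")

# Same student, different courses: the key pairs differ, so both are accepted.
conn.execute("INSERT INTO enrollment VALUES (1, 101, 'A')")
conn.execute("INSERT INTO enrollment VALUES (1, 102, 'B')")

# Same (student_id, course_id) pair again: rejected.
try:
    conn.execute("INSERT INTO enrollment VALUES (1, 101, 'C')")
    duplicate_key_rejected = False
except sqlite3.IntegrityError:
    duplicate_key_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
print(duplicate_key_rejected, row_count)
```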
Foreign Key Constraint
• A foreign key is a key used to link two tables together. This is sometimes called a
referencing key.
• Foreign Key is a column or a combination of columns whose values match a
Primary Key in a different table.
• The relationship between 2 tables matches the Primary Key in one of the tables
with a Foreign Key in the second table.
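A foreign key linking two tables can be sketched with Python's sqlite3 module. The customers and orders tables are hypothetical. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection; other database systems typically enforce them without such a switch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement

conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders ("
             "order_id INTEGER PRIMARY KEY, "
             "customer_id INTEGER REFERENCES customers(id))")

conn.execute("INSERT INTO customers VALUES (1, 'Asha')")
conn.execute("INSERT INTO orders VALUES (500, 1)")  # matches customers.id = 1

# No customer has id 99, so this order has nothing to refer to.
try:
    conn.execute("INSERT INTO orders VALUES (501, 99)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

order_ids = [r[0] for r in conn.execute("SELECT order_id FROM orders")]
print(fk_enforced, order_ids)
```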
CHECK Constraint
• The CHECK Constraint enables a condition to check the value being entered into a
record. If the condition evaluates to false, the record violates the constraint and
isn't entered into the table.
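A CHECK constraint can be sketched with Python's sqlite3 module. The customers table and the rule that age must be at least 18 are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers ("
             "id INTEGER PRIMARY KEY, "
             "name TEXT, "
             "age INTEGER CHECK (age >= 18))")

# The condition holds, so the record is entered.
conn.execute("INSERT INTO customers (name, age) VALUES ('Asha', 25)")

# The condition evaluates to false, so the record is rejected.
try:
    conn.execute("INSERT INTO customers (name, age) VALUES ('Ben', 15)")
    check_enforced = False
except sqlite3.IntegrityError:
    check_enforced = True

names = [r[0] for r in conn.execute("SELECT name FROM customers")]
print(check_enforced, names)
```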
