Disadvantages of File Processing System
Even the simplest data retrieval task from a file requires extensive programming. It is also a
time-consuming, high-skill activity.
To access the data in a file, the programmer must be aware of the physical structure of
the file.
Security features such as effective password protection, locking parts of a file, etc. are
very difficult to program.
The file system exhibits structural dependence. That is, a change in the file structure,
such as the addition or deletion of a field, requires the modification of all programs using
that file.
Data dependence: a change in a file's data characteristics, such as changing a field's data
type from integer to decimal, requires changes in all programs that access the file.
A typical file processing system is supported by a conventional operating system. The system
stores permanent records in various files. It uses various application programs to extract
records from, and add records to, the appropriate files. Before using a DBMS to store and
retrieve data, organizations stored information in file processing systems.
But as the number of files in the system expands, system administration becomes difficult
too. Each file must have its own file management system, composed of programs that allow
users to create the file structure, add data to the file, delete data from the file, modify the
data in the file, list the file contents, etc.
Even a simple file processing system containing 25 files requires 5 * 25 = 125 file
management programs. Each department in the organization owns its data by creating its
own files, so the number of files can multiply rapidly.
Security features such as effective password protection, locking out parts of files or parts of
the system itself, and other data confidentiality measures are difficult to program and are
usually omitted.
The file system's structure and lack of security make it difficult to pool data. The same
basic data is stored in different locations, but it is very unlikely that data stored in
different locations will always be updated consistently, so different versions of the same
data are maintained. The file processing system is simply not suitable for modern data
management and information requirements.
Data Inconsistency
Data inconsistency means that different copies of the same data do not match, that is,
different versions of the same basic data exist. This occurs as a result of update
operations that do not update the same data stored at different places.
Data Isolation
Data are scattered in various files, and the files may be in different formats, so
writing new application programs to retrieve the data is difficult.
Integrity Problems
The data values may need to satisfy some integrity constraints; for example, the balance
field value must be greater than 5000. In a file processing system we have to handle this
through program code, but in a database we can declare the integrity constraints along with
the definition itself, as shown in the sketch below.
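A minimal sketch of this difference, using Python's built-in sqlite3 module: the account table, its columns, and the balance > 5000 rule are taken from the example above and are otherwise hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database for the sketch

# The integrity constraint is declared together with the table definition,
# so the DBMS itself rejects any row that violates it.
conn.execute("""
    CREATE TABLE account (
        acc_no  INTEGER PRIMARY KEY,
        balance INTEGER CHECK (balance > 5000)  -- constraint from the example
    )
""")

conn.execute("INSERT INTO account VALUES (1, 8000)")      # accepted
try:
    conn.execute("INSERT INTO account VALUES (2, 1000)")   # rejected by the DBMS
except sqlite3.IntegrityError as err:
    print("Rejected:", err)
```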
Atomicity Problem
It is difficult to ensure atomicity in a file processing system. For example, consider
transferring $100 from Account A to Account B. If a failure occurs during execution, there
could be a situation where $100 is deducted from Account A but not credited to Account B.
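A minimal sketch, again using Python's sqlite3 module and a hypothetical account table, of how a DBMS transaction keeps such a transfer atomic: either both updates are committed or both are rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 500)])

try:
    with conn:  # one transaction: committed on success, rolled back on any exception
        conn.execute("UPDATE account SET balance = balance - 100 WHERE name = 'A'")
        # if a failure occurred here, the debit above would be rolled back as well
        conn.execute("UPDATE account SET balance = balance + 100 WHERE name = 'B'")
except sqlite3.Error:
    pass  # after a rollback neither update is visible

print(conn.execute("SELECT name, balance FROM account").fetchall())
```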
The term database system refers to an organization of components that define and regulate
the collection, storage, management, and use of data within a database environment. At a
high level, the database system is composed of the following five major parts.
Hardware
Software
People
Procedures
Data
Hardware identifies all of the system's physical devices, including computers, computer
peripherals, network components, etc.
Software refers to the collection of programs used within the database system. It includes
the operating system, the DBMS software, and application programs and utilities.
Operating System
DBMS Software
Application Programs and Utilities
The operating system manages all the hardware components and makes it possible for all
other software to run on the computers. UNIX, Linux, Microsoft Windows, etc. are popular
operating systems used in database environments.
DBMS software manages the database within the database system. Oracle Corporation's
Oracle, IBM's DB2, Sun's MySQL, Microsoft's MS Access and SQL Server, etc. are popular
DBMS (RDBMS) software used in database environments.
Application programs and utility software are used to access and manipulate the data in
the database and to manage the operating environment of the database.
People includes everyone who uses the database system; they can be classified as follows.
System Administrators
Data Modelers
Database Administrators
System Analysts and Programmers
End Users
Procedures are the instructions and business rules that govern the design and use of the
database system.
Data is the most important basic component of a database: it is the collection of facts
stored in the database.
DBMS Functions:
The data dictionary stores the definitions of data elements and their relationships. This
information is termed metadata. The metadata includes the definitions of data, data types,
relationships between data, integrity constraints, etc. Any changes made in the database
structure are automatically reflected in the data dictionary. In short, the DBMS provides data
abstraction and removes structural and data dependency from the system.
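As an illustration only: in SQLite (used here through Python's sqlite3 module) the catalogue table sqlite_master plays the role of a simple data dictionary, storing the definition of every table; other DBMSs expose similar catalogues. The student table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

# The data dictionary describes the data (metadata), not the data itself.
for name, sql in conn.execute("SELECT name, sql FROM sqlite_master WHERE type = 'table'"):
    print(name, "->", sql)
```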
The DBMS creates the complex structures required for data storage, and users are freed
from defining, programming, and implementing the complex physical data characteristics.
The DBMS supports data independence: it translates logical requests into commands that
physically locate and retrieve the requested data, and formats the physically retrieved data
according to the logical data format specifications.
Security Management
The DBMS creates a security system that enforces user security and data privacy within the
database. Security rules determine the access rights of the users; the read/write access
given to a user is specified using these access rights.
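The following is only a schematic sketch of the idea of checking an operation against a user's access rights; in a real DBMS, privileges are managed by the DBMS itself (for example through SQL GRANT and REVOKE statements), not by application code. The user names and privilege sets are hypothetical.

```python
# Hypothetical security rules: user -> set of privileges granted on the data.
ACCESS_RIGHTS = {
    "alice": {"read", "write"},
    "bob": {"read"},
}

def check_access(user, operation):
    """Allow the operation only if the security rules grant it to the user."""
    return operation in ACCESS_RIGHTS.get(user, set())

print(check_access("bob", "write"))  # False: bob has read-only access
```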
Data Independence
A major purpose of a database system is to provide users with an abstract view of the data.
To hide the complexity from users, the database applies different levels of abstraction. The
following are the different levels of abstraction.
Physical Level
Logical Level
View Level
Physical Level
The physical level is the lowest level of abstraction and defines the storage structure. The
physical level describes complex low-level data structures in detail. The database system
hides many of the lowest-level storage details from database programmers, although
database administrators may be aware of certain details of the physical organization of data.
Logical Level
This is the next higher level of abstraction, which describes what data are stored in the
database, the relationships between the data, the types of data, etc. Database programmers,
DBAs, etc. know the logical structure of the data.
View Level
This is the highest level of abstraction. It provides different views to different users. At the
view level, users see a set of application programs that hide the details of data types; details
such as data types are not available at this level. A view of, or access to, only part of the
data is given according to the user's access rights.
Database Languages
Data manipulation includes the following operations:
Retrieval of Information Stored in Database
Insertion of Information to the database
Deletion of information from the database
Updating of information stored in the database
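A minimal sketch of these four operations using Python's built-in sqlite3 module and a hypothetical student table; the SQL statements are the data manipulation statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

conn.execute("INSERT INTO student VALUES (1, 'Asha')")                # insertion
print(conn.execute("SELECT * FROM student").fetchall())               # retrieval
conn.execute("UPDATE student SET name = 'Asha K' WHERE roll_no = 1")  # updating
conn.execute("DELETE FROM student WHERE roll_no = 1")                 # deletion
```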
Procedural DML:
In a procedural data manipulation language, the user has to specify what data are needed
and how to get them.
A database system is divided into modules based on their function. The functional
components of a database system can be broadly divided into the storage manager and the
query processor components.
Storage Manager
Query Processor
Storage Manager
The storage manager is important because databases typically require a large amount of
storage space, so it is very important to use storage efficiently and to minimize the
movement of data to and from disk.
A storage manager is a program module that provides the interface between the low-level
data stored in the database and the application programs and queries submitted to the
system. The storage manager is responsible for the interaction with the file manager: it
translates the various DML statements into low-level file system commands. Thus the
storage manager is responsible for storing, retrieving, and updating data in the database.
The storage manager components include the following.
The authorization and integrity manager tests for the satisfaction of integrity constraints and
checks the authority of users to access data. The transaction manager ensures that the
database remains in a consistent state and allows concurrent transactions to proceed
without conflicting. The file manager manages the allocation of space on disk storage and
the data structures used to represent information stored on disk. The buffer manager is
responsible for fetching data from disk storage into main memory and deciding what data to
cache in main memory.
The storage manager implements the following data structures as part of the physical
system implementation: data files, the data dictionary, and indices. Data files store the
database itself. The data dictionary stores metadata about the structure of the database, in
particular the schema of the database. Indices provide fast access to data items.
The query processor simplifies and facilitates access to data. The query processor includes
the following components.
DDL Interpreter
DML Compiler
Query Evaluation Engine
The DDL interpreter interprets DDL statements and records the definitions in the data
dictionary. The DML compiler translates DML statements in a query language into an
evaluation plan consisting of low-level instructions that the query evaluation engine
understands. The DML compiler also performs query optimization; that is, it picks the
lowest-cost evaluation plan from among the alternatives. The query evaluation engine
executes the low-level instructions generated by the DML compiler.
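As an illustration only: SQLite, used here through Python's sqlite3 module, can be asked to display the evaluation plan its query processor picked for a statement; the student table and its index are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_name ON student(name)")

# The query processor chooses an evaluation plan; here it can use the index on name.
for row in conn.execute("EXPLAIN QUERY PLAN SELECT * FROM student WHERE name = 'Asha'"):
    print(row)
```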
One of the main reasons for using a DBMS is to have central control of both the data and the
programs accessing those data. A person who has such control over the system is called a
database administrator (DBA). The following are the functions of a database administrator.
Schema Definition
Storage structure and access method definition
Schema and physical organization modification.
Granting authorization for data access.
Routine Maintenance
Schema Definition
The database administrator creates the database schema by executing DDL statements.
The schema includes the logical structure of a database table (relation), such as the data
types of attributes, the lengths of attributes, integrity constraints, etc.
Routine Maintenance
Some of the routine maintenance activities of a DBA are given below.
The process of database design is an iterative one. The ER model is used at the conceptual
design stage of database design, and an ER diagram is used to represent this conceptual
design. The requirements analysis is modeled in this conceptual design. The ER model is
very expressive, so people can easily understand the requirements. The data modeler
prepares the ER diagram, and it is verified with the functional domain experts to ensure that
all the requirements are properly incorporated in the conceptual design. The process is
repeated until the end users and designers agree that the E-R diagram is a fair
representation of the requirements. We can then easily map the ER diagram to a relational
schema.
The basic constructs of the ER model are entities, attributes, and relationships. An entity is
an object that exists in the real world and is distinguishable from other entities.
Conceptual Design
ER Diagram
Entities
Attributes
Relationships
Entities
An entity is a thing or object in the real world that is distinguishable from all other objects.
For example, 'Person' in an organization is an entity. An entity has a set of properties. At
the E-R modeling level an entity actually refers to an entity set; in other words, an entity in
the ER model corresponds to a table.
Entity
Entity Set
An entity may be concrete, such as a person or a book, or abstract, such as an account or a
loan. The ER model refers to a specific table row as an entity instance or entity occurrence.
A collection of similar entities (an entity set) often corresponds to a table, and each entity
set has a key. All entities in an entity set have the same set of attributes; thus an entity set
is a set of entities of the same type that share the same properties or attributes. An entity
is represented by a rectangle containing the entity name, which is a noun usually written in
capital letters.
Attributes
An entity is represented by a set of attributes, and an attribute corresponds to a field in a
table. For each attribute there is a set of permitted values called the domain, or value set,
of the attribute. Attributes are represented by ovals connected to the entity with a line; each
oval contains the name of the attribute it represents.
Relationships
Attribute Types
In the Entity Relationship (ER) model, attributes can be classified into the following types.
A simple attribute consists of a single atomic value and cannot be subdivided. For example,
the attributes age, sex, etc. are simple attributes.
A composite attribute is an attribute that can be further subdivided. For example the
attribute ADDRESS can be subdivided into street, city, state, and zip code.
A single-valued attribute can have only a single value. For example, a person can have only
one 'date of birth', one 'age', etc. A single-valued attribute can be simple or composite:
'date of birth' is a composite attribute and 'age' is a simple attribute, but both are
single-valued attributes.
Multivalued attributes can have multiple values. For instance, a person may have multiple
phone numbers, multiple degrees, etc. Multivalued attributes are shown by a double line
connecting the attribute to the entity in the ER diagram.
Single-valued attribute: an attribute that holds a single value.
Example 1: Age
Example 2: City
Example 3: Customer ID
Derived Attribute
The value of a derived attribute is derived from a stored attribute. For example, 'Date of
Birth' of a person is a stored attribute, and the value of the attribute 'Age' can be derived by
subtracting the 'Date of Birth' (DOB) from the current date. The stored attribute supplies the
value for the derived attribute.
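A minimal sketch, assuming a Python representation of the stored attribute, of deriving AGE from 'Date of Birth' instead of storing it; the function and variable names are hypothetical.

```python
from datetime import date

def age(dob, today=None):
    """Derive the AGE attribute from the stored attribute DOB; AGE itself is not stored."""
    today = today or date.today()
    # subtract one year if this year's birthday has not been reached yet
    return today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))

print(age(date(1990, 6, 15)))
```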
Complex Attribute
Candidate Keys
Candidate Keys are super keys for which no proper subset is a super key. In other words
candidate keys are minimal super keys.
Primary Key:
It is the candidate key that is chosen by the database designer to identify entities within an
entity set. The primary key is a minimal super key. In the ER diagram the primary key is
represented by underlining the primary key attribute. Ideally a primary key is composed of
only a single attribute, but it is possible to have a primary key composed of more than one
attribute.
Composite Key
A composite key consists of more than one attribute.
Example: Consider a relation or table R with attributes A, B, C, D, and E:
R(A, B, C, D, E)
A→BCDE: the attribute A uniquely determines the other attributes B, C, D, and E.
BC→ADE: the attributes B and C jointly determine all the other attributes A, D, and E in
the relation.
Primary key: A
Candidate keys: A, BC
Super keys: A, BC, ABC, AD, ...
ABC and AD are not candidate keys, since neither is a minimal super key.
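As a hedged sketch, the standard attribute-closure check can be used to verify the example: an attribute set is a super key exactly when its closure under the given functional dependencies contains every attribute of R.

```python
# Functional dependencies from the example: A -> BCDE and BC -> ADE.
FDS = [({"A"}, {"B", "C", "D", "E"}), ({"B", "C"}, {"A", "D", "E"})]
ALL_ATTRS = {"A", "B", "C", "D", "E"}

def closure(attrs):
    """Compute the closure of a set of attributes under FDS."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in FDS:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

for candidate in ({"A"}, {"B", "C"}, {"A", "D"}):
    print(sorted(candidate), "is a super key:", closure(candidate) == ALL_ATTRS)
```

All three sets tested are super keys, but only A and BC are minimal, so only they are candidate keys.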
Entity Types
The Entity Relationship (ER) model consists of different types of entities. The existence of an
entity may depend on the existence of one or more other entities; such an entity is said to
be existence dependent. Entities whose existence does not depend on any other entity are
termed not existence dependent.
Strong Entities
Weak Entities
Recursive Entities
Composite Entities
A weak entity is existence dependent; that is, the existence of a weak entity depends on the
existence of an identifying entity set.
The discriminator (or partial key) of a weak entity set is the set of attributes that
distinguishes among the weak entities that depend on the same identifying entity.
The primary key of a weak entity set is formed by the primary key of the identifying entity
set together with the discriminator of the weak entity set, as sketched below.
We underline the discriminator of a weak entity set with a dashed line in the ER diagram.
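A hedged sketch of how such a composite primary key could be declared, using Python's sqlite3 module; the loan and payment entities are hypothetical, with payment as the weak entity identified through loan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE loan (
        loan_no INTEGER PRIMARY KEY,
        amount  INTEGER
    );
    -- payment is a weak entity: its key combines the identifying entity's
    -- primary key (loan_no) with the discriminator (payment_no).
    CREATE TABLE payment (
        loan_no    INTEGER REFERENCES loan(loan_no),
        payment_no INTEGER,            -- discriminator (partial key)
        pay_amount INTEGER,
        PRIMARY KEY (loan_no, payment_no)
    );
""")
```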
Recursive Entity
A recursive entity is one in which a relationship can exist between occurrences of the same
entity set. This occurs in a unary relationship.
Composite Entities
If a many-to-many relationship exists, we must create a bridge entity to convert it into two
one-to-many relationships. The bridge entity is composed of the primary keys of each of the
entities to be connected and is known as a composite entity. A composite entity is
represented by a diamond shape within a rectangle in an ER diagram.
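A hedged sketch of a bridge (composite) entity, again using Python's sqlite3 module with hypothetical student and course entities: the enrollment table's key is composed of the primary keys of the two entities it connects, turning the many-to-many relationship into two one-to-many relationships.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (course_id INTEGER PRIMARY KEY, title TEXT);

    -- Bridge (composite) entity between student and course.
    CREATE TABLE enrollment (
        roll_no   INTEGER REFERENCES student(roll_no),
        course_id INTEGER REFERENCES course(course_id),
        PRIMARY KEY (roll_no, course_id)
    );
""")
```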
ER Diagram Symbols
An Entity Relationship diagram (ER diagram) is used to represent the requirements analysis
at the conceptual design stage. The database is designed from the ER diagram, or we can
say that the ER diagram is converted into the database.
Normalization
Un-Normalized Form
First Normal Form (1 NF)
Second Normal Form (2 NF)
Third Normal Form (3 NF)
Boyce – Codd Normal Form (BCNF)
Fourth Normal Form (4 NF)
Fifth Normal Form (5 NF)
Un-Normalized Form
An un-normalized relation contains non-atomic values. Each row may contain multiple sets
of values for some of the columns; these multiple values in a single row are called
non-atomic values.
Problem with BCNF: given a relation R and a set of functional dependencies F, a BCNF
decomposition may or may not preserve all of the given functional dependencies.
Fourth Normal Form
A relation is in 4NF if it is in BCNF and has no multivalued dependency.
Transaction
The term transaction refers to a collection of operations that form a single logical unit of
work. A transaction T is a logical unit of database processing that includes one or more
database access operations.
Atomicity
Consistency
Isolation
Durability
Atomicity
A transaction must be atomic, i.e. it must be ensured that either all operations of the
transaction are reflected properly in the database or none are.
Consistency
If the database is in a consistent state before the execution of the transaction, the database
remains consistent after the execution of the transaction.
Example: Transaction T1 transfers $100 from Account A to Account B. Accounts A and B
each contain $500 before the transaction.
Transaction T1
Read (A)
A = A - 100
Write (A)
Read (B)
B = B + 100
Write (B)
Consistency Constraint
Before transaction execution: Sum = A + B = 500 + 500 = 1000
After transaction execution: Sum = A + B = 400 + 600 = 1000
Isolation
When multiple transactions are executing concurrently, each transaction is unaware of the
other transactions executing concurrently in the system; i.e., the execution of one
transaction must not interfere with that of another.
Durability
Changes applied to a database by a committed transaction must be made permanent even
if the system fails.
Query Processor
Transaction Manager
Recovery Manager