PAPER2 - Option A - DATABASE - NOTES


UNIT A.1 BASIC CONCEPTS
A.1.1 Outline the differences between data and information.
A.1.2 Outline the differences between an information system and a database.
A.1.3 Discuss the need for databases.
A.1.4 Describe the use of transactions, states and updates to maintain data consistency (and integrity).
A.1.5 Define the term database transaction.
A.1.6 Explain concurrency in a data sharing situation.
A.1.7 Explain the importance of the ACID properties of a database transaction.
A.1.8 Describe the two functions databases require to be performed on them.
A.1.9 Explain the role of data validation and data verification.
BASIC CONCEPTS OF DATA SYSTEMS

INTRODUCTION
As its name suggests, a database is a base for storing data. Databases have many advantages over the
old paper system of storing data: they save physical storage space compared to paper records, allow
multiple people to access the same data at the same time, and support queries (similar to filters) that
show only the data required.
A database is a system that allows us to store data in a structured way using tables and fields, and
gives us various means of access to the data.
A.1.1 OUTLINE THE DIFFERENCES BETWEEN DATA AND INFORMATION

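What is the difference between data and information?
Data is a collection of raw facts that are meaningless on their own, whereas information is data that
has been put into a clear, understandable context. For example, the number 38 on its own is data;
"the patient's temperature is 38 °C" is information.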
What is the difference between a database and a spreadsheet?
At first glance a database can look very similar to a spreadsheet. Spreadsheets are primarily used to
manipulate data, using functions and formulae to perform calculations and statistics, whereas
databases are primarily used to store data, often have relationships between tables, and should allow
the user to easily generate queries to view specific data. Databases often contain much more data
than a spreadsheet.
A.1.2 OUTLINE THE DIFFERENCES BETWEEN AN INFORMATION SYSTEM AND A DATABASE
What is the difference between a database and an information system?
A database may form part of the backend of an information system.
‘An Information system (IS) is a formal, sociotechnical, organizational system designed to collect,
process, store, and distribute information. In a sociotechnical perspective, information systems are
composed by four components: task, people, structure (or roles), and technology.’
A database contains the data that is used by an information system, whereas the information system
comprises the complete system (task, people, structure and technology). The information system may
also present the data from the database in a way in which it becomes information.
A.1.3 THE NEED FOR DATABASES.
Databases are essential for managing large amounts of data efficiently and effectively. Here are some
reasons why databases are needed:
● Data organisation: Databases provide a way to organise data in a structured manner, making it
easier to store, retrieve, and manipulate data. Without a database, data would be stored in
individual files, which would make it difficult to manage and access.
● Data integrity: Databases ensure data integrity by providing mechanisms to ensure that data is
accurate and consistent. This is important when multiple users or applications need to access the
same data. Without a database, it would be difficult to maintain data consistency and accuracy.
● Data security: Databases provide a secure way to store data by allowing administrators to
control access to data. This helps protect sensitive data from unauthorised access, ensuring that
only authorised users can access the data.
● Scalability: Databases are designed to handle large amounts of data, making them a scalable
solution for organisations that need to store and manage large volumes of data.
● Performance: Databases are optimised for performance, allowing users to access and
manipulate data quickly and efficiently. This is especially important for applications that need
to process large amounts of data quickly.
● Data sharing: Databases enable data sharing among different applications and users, making it
easier for teams to collaborate and share data across different systems and applications.
Databases are essential for managing data effectively, ensuring data integrity, security, and scalability,
and optimizing performance.
A.1.4 TRANSACTIONS, STATES AND UPDATES

Transactions, states, and updates are important concepts in database management that are used to
maintain data consistency and integrity.
● Transactions: A transaction is a logical unit of work that consists of one or more database
operations that must be executed together as a single, atomic unit. Transactions ensure that
either all of the operations are completed successfully or none of them are completed at all. This
helps maintain data consistency by ensuring that the database remains in a consistent state,
even in the event of errors or system failures.
● States: A state is the condition of the database (the values of all its data) at a given point in
time. Every successful transaction moves the database from one consistent state to another. If a
transaction fails, the database is returned to the state it was in before the transaction began,
which is how consistency across states is maintained.
● Updates: Updates are changes made to the data in the database, including insertions,
modifications, and deletions. Database management systems apply update operations while
maintaining data consistency and integrity. This is commonly accomplished through the use of
locking mechanisms, which ensure that only one user can make changes to a particular record at
a time. This helps prevent conflicts and inconsistencies in the data.
By using transactions, states, and updates, database management systems can ensure data consistency
and integrity by ensuring that changes to the database are made in a controlled and consistent manner.
This helps prevent errors, conflicts, and inconsistencies in the data, which can lead to problems with
data quality and reliability.
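The interaction between transactions, states and updates can be sketched with a small bank-transfer example. This is only an illustrative sketch using Python's built-in sqlite3 module; the accounts table, the transfer function and the amounts are assumptions made for the example and do not come from these notes.

```python
import sqlite3

def transfer(conn, from_id, to_id, amount):
    """Move `amount` between two accounts as one transaction (all or nothing)."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, from_id),
            )
            # Hypothetical business rule: a balance may never go negative.
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (from_id,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, to_id),
            )
        return True
    except ValueError:
        return False

def balances(conn):
    """Return the current state of the accounts table."""
    return conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 50)")
conn.commit()

ok = transfer(conn, 1, 2, 30)       # both updates applied: new consistent state
failed = transfer(conn, 1, 2, 500)  # first update undone: state is unchanged
print(ok, failed, balances(conn))
```

The second transfer changes account 1 and then fails the rule, so the rollback restores the state left by the first transfer. The total amount of money is the same in every state that other users could observe.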
A.1.5 WHAT IS A DATABASE TRANSACTION?

A database transaction is a logical unit of work that involves one or more database operations, such
as insert, update, or delete. A transaction ensures that all of the operations in the unit are executed
together as a single, atomic operation, which means that either all of the operations are completed
successfully, or none of them are completed at all.
In other words, a transaction is a sequence of database operations that are executed as a single unit of
work. Transactions are used to ensure data consistency and integrity by ensuring that the database
remains in a consistent state, even in the event of errors, system failures, or other problems.
A.1.6 WHAT IS CONCURRENCY
Concurrency in a data sharing situation refers to the ability of multiple users or applications to access
and manipulate the same data simultaneously. In a shared data environment, concurrency can lead to
conflicts, inconsistencies, and other issues if not managed properly.
In database management, concurrency control is the process of managing concurrent access to data in
order to maintain data consistency and integrity. This involves implementing mechanisms to prevent
conflicts and inconsistencies that can arise when multiple users or applications attempt to access and
manipulate the same data simultaneously.
There are several techniques for managing concurrency in a data sharing situation, including:
● Locking: Locking involves the use of locks to control access to data. When a user or
application accesses a particular record, a lock is placed on that record, preventing other users or
applications from accessing or modifying it until the lock is released.
● Time-stamping: Time-stamping involves assigning a unique timestamp to each transaction that
accesses the database. If two transactions attempt to modify the same data, the system can use
the timestamps to determine which transaction should be given priority.
● Multi-version Concurrency Control (MVCC): MVCC involves creating multiple versions of
a data record to allow multiple users or applications to access and modify the same data
simultaneously. Each user or application sees a version of the data that reflects the state of the
database at the time the user or application began the transaction.
Concurrency is an important concept in data sharing situations, and it is essential to manage
concurrency effectively to maintain data consistency, integrity, and reliability.
A.1.7 EXPLAIN THE IMPORTANCE OF THE ACID PROPERTIES OF A DATABASE TRANSACTION

The ACID properties are important because they:
● Maintain integrity constraints defined on the database schema.
● Prevent concurrent transaction anomalies such as dirty, non-repeatable, and phantom reads.
● Provide reliable recoverability from system crashes and database failures.
A TYPICAL TRANSACTION HAS FOUR PROPERTIES, COMMONLY REFERRED TO AS ACID PROPERTIES:
1. ATOMICITY: A transaction is atomic, which means that either all of its operations are
executed successfully or none of them are executed at all. This ensures that the database is
never left with a partially completed transaction.
2. CONSISTENCY: A transaction takes the database from one consistent state to another, so all
rules and constraints hold both before and after it is executed.
3. ISOLATION: Transactions are executed in isolation from one another, which means that the
changes made by one transaction are not visible to other transactions until it has been committed.
4. DURABILITY: Once a transaction is committed, its changes are permanently stored in the
database, even in the event of a system failure or other problem.
Database transactions are an essential concept in database management and are used to ensure data
consistency, integrity, and reliability.
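Isolation and durability can be observed with a short experiment. The following is a sketch only, using Python's sqlite3 module with a hypothetical seats table; closing and reopening the database file stands in for a restart and is a much weaker test than a real system crash.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "acid.db")

# isolation_level=None: transactions are controlled explicitly with BEGIN/COMMIT.
writer = sqlite3.connect(path, isolation_level=None)
writer.execute("CREATE TABLE seats (seat TEXT PRIMARY KEY, booked_by TEXT)")
writer.execute("INSERT INTO seats VALUES ('12A', NULL)")

reader = sqlite3.connect(path, isolation_level=None)

def booked_by(conn):
    """Return who has booked seat 12A, as seen through the given connection."""
    rows = conn.execute("SELECT booked_by FROM seats WHERE seat = '12A'").fetchall()
    return rows[0][0]

writer.execute("BEGIN")
writer.execute("UPDATE seats SET booked_by = 'Ana' WHERE seat = '12A'")
seen_during = booked_by(reader)  # isolation: the uncommitted change is not visible
writer.execute("COMMIT")
seen_after = booked_by(reader)   # the change becomes visible once committed

writer.close()
reader.close()

# Durability (approximated): the committed change is still there after reopening.
reopened = sqlite3.connect(path)
seen_after_reopen = booked_by(reopened)
reopened.close()

print(seen_during, seen_after, seen_after_reopen)
```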
A.1.8 QUERIES AND UPDATES

Databases require two fundamental functions to be performed on them: query functions and update
functions.
1. Query Functions: Query functions are used to retrieve data from the database. These functions
allow users or applications to search for specific data or to retrieve a subset of data that meets
certain criteria. Common query functions include SELECT statements in SQL and find()
functions in some NoSQL databases. Query functions support data analysis and reporting
through sorting, grouping, filtering, and aggregating data. They are essential for retrieving
data from the database and for generating reports and insights.
2. Update Functions: Update functions are used to modify the data in the database. These
functions allow users or applications to add, update, or delete data in the database. Common
update functions include INSERT, UPDATE, and DELETE statements in SQL and save()
and remove() functions in some NoSQL databases. Update functions are essential for
maintaining the accuracy and integrity of the data in the database. They allow users to make
changes to the data, such as correcting errors, updating records, or deleting obsolete data.
Update functions must be used carefully to ensure that data consistency and integrity are
maintained.
SELECT STATEMENTS EXAMPLES

-- The asterisk (*) retrieves all field values
SELECT * FROM Customers;
SELECT CustomerName, City FROM Customers;
SELECT DISTINCT Country FROM Customers;
Inside a table, a column often contains many duplicate values, and sometimes you only want to
list the different (distinct) values.
SELECT * FROM Customers
WHERE Country = 'Mexico';
SELECT * FROM Products
ORDER BY Price;
SELECT *
FROM Customers
WHERE Country = 'Spain' AND CustomerName LIKE 'G%';
SELECT *
FROM Customers
WHERE Country = 'Germany' OR Country = 'Spain';
SELECT * FROM Customers
WHERE NOT Country = 'Spain';
INSERT/UPDATE/DELETE STATEMENTS
The INSERT INTO statement is used to insert new records in a table.
Insert Data Only in Specified Columns:
INSERT INTO Customers (CustomerName, City, Country)
VALUES ('Cardinal', 'Stavanger', 'Norway');
Insert Multiple Rows:
INSERT INTO Customers (CustomerName, ContactName, Address, City, PostalCode,
Country) VALUES
('Cardinal', 'Tom B. Erichsen', 'Skagen 21', 'Stavanger', '4006', 'Norway'),
('Greasy Burger', 'Per Olsen', 'Gateveien 15', 'Sandnes', '4306', 'Norway'),
('Tasty Tee', 'Finn Egan', 'Streetroad 19B', 'Liverpool', 'L1 0AA', 'UK');
The UPDATE statement is used to modify the existing records in a table.

UPDATE Customers
SET ContactName = 'Alfred Schmidt', City= 'Frankfurt'
WHERE CustomerID = 1;
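The DELETE statement is used to delete existing records in a table. Without a WHERE clause it
deletes every record in the table:
DELETE FROM Customers;
To delete only specific records, add a WHERE clause:
DELETE FROM Customers
WHERE CustomerID = 1;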
Databases require both query functions and update functions to be performed on them. Query functions
are used to retrieve data from the database and allow for data analysis and reporting. Update functions
are used to modify data in the database and ensure data accuracy and integrity. These two functions are
essential for managing data effectively in a database system.
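The statements above can be tried out end to end. The sketch below runs a few of them against an in-memory SQLite database from Python; the simplified Customers table definition and the new city 'Oslo' are assumptions for the example, while the three customer rows are taken from the INSERT example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Customers (
           CustomerID   INTEGER PRIMARY KEY,
           CustomerName TEXT,
           City         TEXT,
           Country      TEXT
       )"""
)

# Update function: INSERT adds new records.
conn.executemany(
    "INSERT INTO Customers (CustomerName, City, Country) VALUES (?, ?, ?)",
    [
        ("Cardinal", "Stavanger", "Norway"),
        ("Greasy Burger", "Sandnes", "Norway"),
        ("Tasty Tee", "Liverpool", "UK"),
    ],
)

# Query function: SELECT retrieves data without changing it.
countries = conn.execute(
    "SELECT DISTINCT Country FROM Customers ORDER BY Country"
).fetchall()

# Update function: UPDATE changes only the records matched by WHERE.
conn.execute("UPDATE Customers SET City = 'Oslo' WHERE CustomerName = 'Cardinal'")

# Update function: DELETE removes only the records matched by WHERE.
conn.execute("DELETE FROM Customers WHERE Country = 'UK'")

remaining = conn.execute(
    "SELECT CustomerName, City FROM Customers ORDER BY CustomerID"
).fetchall()

print(countries)
print(remaining)
```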
A.1.9 DATA VALIDATION AND DATA VERIFICATION

Data validation and data verification are two important processes used to ensure the accuracy,
completeness, and consistency of data in a database system. Although the terms are sometimes used
interchangeably, they refer to different processes.
DATA VALIDATION:
Data validation is the process of checking whether the data entered into a system is sensible,
complete, and consistent with predefined rules and constraints. The purpose of data validation is
to ensure that the data entered into the system is acceptable and can be used reliably. Validation
cannot guarantee that a value is true, only that it obeys the rules.
Data validation is typically performed when data is first entered into the system, and it involves
checking for errors, such as missing or invalid data, incorrect data types, or data that does not
conform to predefined rules and constraints. Data validation may be performed using automated
validation tools, such as regular expressions, or it may involve manual review and correction of
data.
DATA VERIFICATION:
Data verification is the process of checking whether the data in the database is accurate,
complete, and consistent with the original source. The purpose of data verification is to ensure
that the data stored in the database is a true representation of the original data source.
Data verification is typically performed on a periodic basis, such as during data migrations or
when integrating data from multiple sources. It involves comparing the data in the database with
the original source to ensure that it is accurate and complete. Data verification may involve
manual checks, automated tools, or a combination of both.
Example: double data entry (for example, typing a new password twice on a Reset Password form).
Data validation and data verification are both essential processes for ensuring the accuracy,
completeness, and consistency of data in a database system. Data validation checks that data is
sensible, complete, and within the predefined rules when it is first entered into the system, while
data verification checks that the data stored in the database matches the original source. By
performing both data validation and data verification, organizations can ensure that their data is
reliable, accurate, and useful.
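The difference between the two processes can be sketched in a few lines of Python. The field names, the rules (name required, simple email format, age between 0 and 120) and the function names are all assumptions chosen for illustration; real systems define their own rules.

```python
import re

# Deliberately simple pattern: something@something.something, with no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate_customer(record):
    """Validation: check a new record against predefined rules.

    Returns a list of error messages (empty if the record is acceptable).
    """
    errors = []
    if not record.get("name", "").strip():
        errors.append("name is required")               # presence check
    if not EMAIL_PATTERN.match(record.get("email", "")):
        errors.append("email has an invalid format")    # format check
    age = record.get("age")
    if not isinstance(age, int):
        errors.append("age must be a whole number")     # type check
    elif not 0 <= age <= 120:
        errors.append("age must be between 0 and 120")  # range check
    return errors

def verify_double_entry(first_entry, second_entry):
    """Verification: the same value is typed twice and the two copies must match."""
    return first_entry == second_entry

print(validate_customer({"name": "Ana", "email": "ana@example.com", "age": 30}))
print(validate_customer({"name": "", "email": "not-an-email", "age": 200}))
print(verify_double_entry("s3cret!", "s3cret"))
```

A record can pass validation and still be wrong, for instance a valid-looking age that was mistyped. That is the gap verification is meant to close.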
UNIT A.2 THE RELATIONAL DATABASE MODEL

A.2.1 Define the terms: DBMS and RDBMS.
A.2.2 Outline the functions and tools of a DBMS.
A.2.3 Describe how a DBMS can be used to promote data security.
A.2.4 Define the term schema.
A.2.5 Identify the characteristics of the three levels of the schema: conceptual, logical, physical.
A.2.6 Outline the nature of the data dictionary.
A.2.7 Explain the importance of a data definition language in implementing a data model.
A.2.8 Explain the importance of data modelling in the design of a database.
A.2.9 Define the following database terms: table, record, field, primary key, secondary key, foreign
key, candidate key, composite primary key, join
A.2.10 Identify the different types of relationships
A.2.11 Outline the issues caused by redundant data.
A.2.12 Outline the importance of referential integrity in a normalized database.
A.2.13 Describe the differences between 1st Normal Form (1NF), 2nd Normal Form (2NF) and 3rd Normal Form (3NF).
A.2.14 Describe the characteristics of a normalized database.
A.2.15 Evaluate the appropriateness of the different data types.
A.2.16 Construct an entity-relationship diagram (ERD) for a given scenario.
A.2.17 Construct a relational database to 3NF using objects such as tables, queries, forms, reports and
macros.
A.2.18 Explain how a query can provide a view of a database.
A.2.19 Describe the difference between a simple and complex query.
A.2.20 Outline the different methods that can be used to construct a query.
A.2.1 | DATABASE MANAGEMENT SYSTEMS (DBMS)

A database management system (DBMS) is software designed to store, manage, and retrieve data in a
structured and organised manner.
The purpose of a DBMS is to provide a centralised, controlled, and efficient environment for managing
data, enabling organisations to store, access, and analyse large amounts of data in a consistent and
organised way.
RELATIONAL DATABASE MANAGEMENT SYSTEM (RDBMS)

A relational database management system (RDBMS) is a database management system (DBMS) that is
based on the relational model as introduced by E. F. Codd.
Relational databases have often replaced legacy hierarchical and network databases because they are
easier to understand and use. Relational databases are powerful because they require few assumptions
about how data is related or how it will be extracted from the database. As a result, the same database
can be viewed in many different ways.
An important feature of relational systems is that a single database can be spread across several
related tables. This differs from flat-file databases, in which each database is self-contained in a
single table.
Most full-scale database systems are RDBMSs.
A.2.2 OUTLINE THE FUNCTIONS AND TOOLS OF A DBMS:
● Data organisation and management: A DBMS helps organisations to store and manage large
amounts of data in a structured and organised manner, making it easier to find and retrieve the
data as needed.
● Data security and privacy: A DBMS provides a controlled environment for managing data,
enabling organisations to enforce data security and privacy policies and ensure that sensitive
data is protected.
● Data consistency and integrity: A DBMS helps to ensure that the data stored in the database is
accurate, consistent, and up-to-date, improving the quality of the data and supporting better
decision making.
● Data sharing and collaboration: A DBMS enables multiple users and applications to access
and use the same data, improving collaboration and data sharing across the organisation.
● Data analysis and reporting: A DBMS provides tools and functions for data analysis and
reporting, enabling organisations to gain insights into their data and make informed decisions
based on that data.
A.2.3 SECURITY
A database management system (DBMS) can be used to promote data security in several ways. Here
are some examples:
● Authentication and Access Control: A DBMS can provide authentication mechanisms to verify
the identity of users who access the system. It can also provide access control mechanisms to
restrict access to data and functions based on the user's role, privilege level, or other criteria.
This helps to prevent unauthorised access to sensitive data and functions.
● Encryption: A DBMS can support encryption mechanisms to protect data in transit and at rest.
Encryption can be used to ensure that data is transmitted securely over networks and stored
securely on disk or in memory. This helps to prevent data theft and unauthorised access to data.
● Audit Trail: A DBMS can maintain an audit trail of all activities that occur in the system. The
audit trail can record all changes to data, all login attempts, and other security-related events.
This can help to detect and investigate security breaches or other incidents.
● Backup and Recovery: A DBMS can support backup and recovery mechanisms to protect
against data loss or corruption. Backup mechanisms can be used to create copies of the database
at regular intervals, while recovery mechanisms can be used to restore the database to a
previous state in the event of a system failure, data loss, or other problems.
● Data Masking: A DBMS can support data masking techniques to protect sensitive data by
replacing it with fictitious data. This can be useful in situations where sensitive data is being
used for testing, training or other purposes where the original data is not required.
A DBMS can be used to promote data security by providing authentication and access control
mechanisms, encryption, audit trails, backup and recovery, data masking, and other security features.
By using these features, organizations can help to protect sensitive data, prevent unauthorised access,
and ensure the integrity and availability of their data.
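One of the features above, the audit trail, can be sketched with a database trigger. This is just one possible way to record changes, shown with SQLite from Python; the salaries and audit_log tables and the trigger name are hypothetical. Many DBMS products also offer their own built-in auditing tools.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE salaries (
        employee TEXT PRIMARY KEY,
        amount   INTEGER NOT NULL
    );

    -- Hypothetical audit table: one row per change to salaries.amount.
    CREATE TABLE audit_log (
        changed_at TEXT DEFAULT CURRENT_TIMESTAMP,
        employee   TEXT,
        old_amount INTEGER,
        new_amount INTEGER
    );

    -- The DBMS runs this trigger automatically after every matching UPDATE.
    CREATE TRIGGER log_salary_change
    AFTER UPDATE OF amount ON salaries
    BEGIN
        INSERT INTO audit_log (employee, old_amount, new_amount)
        VALUES (OLD.employee, OLD.amount, NEW.amount);
    END;

    INSERT INTO salaries VALUES ('Ana', 3000);
    UPDATE salaries SET amount = 3500 WHERE employee = 'Ana';
    """
)

trail = conn.execute(
    "SELECT employee, old_amount, new_amount FROM audit_log"
).fetchall()
print(trail)
```

The application never writes to audit_log itself, so the change is recorded no matter which program performs the update.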
A.2.4 SCHEMA
In database management, a schema refers to the logical structure of a database, which defines the
organization and relationships among the data elements or objects within the database.
 A schema can be thought of as a blueprint or plan for the database, which specifies the types
of data that can be stored in the database, the relationships between different types of data,
and the constraints or rules that govern the data.
A database schema typically consists of a set of tables, which represent the different entities or objects
within the database, along with their attributes or fields. The schema defines the structure of each table,
including the data types and constraints for each field, as well as any relationships between tables.
For example, a database schema for a customer database might include tables for customers, orders,
and products, along with fields for each table such as customer name, order date, and product price. The
schema would define the relationships between these tables, such as the fact that each order is
associated with a particular customer and product.
A schema is an important concept in database management, as it provides a logical framework for
organizing and managing data within a database. By defining the schema of a database, organisations
can ensure that the data is structured and organised in a way that supports their business needs and
objectives.
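The customer, order and product example above could be written down as a schema roughly as follows. This is a sketch under assumptions: the column names, data types and sample rows are invented for illustration, and SQLite is used only because it ships with Python.

```python
import sqlite3

SCHEMA = """
CREATE TABLE customers (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE products (
    product_id    INTEGER PRIMARY KEY,
    product_name  TEXT NOT NULL,
    product_price REAL NOT NULL CHECK (product_price >= 0)
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    order_date  TEXT NOT NULL,
    customer_id INTEGER NOT NULL REFERENCES customers (customer_id),
    product_id  INTEGER NOT NULL REFERENCES products (product_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(SCHEMA)

conn.execute("INSERT INTO customers VALUES (1, 'Cardinal')")
conn.execute("INSERT INTO products VALUES (1, 'Coffee', 2.5)")
conn.execute("INSERT INTO orders VALUES (1, '2024-01-15', 1, 1)")

# The relationships in the schema let one query combine all three tables.
order_summary = conn.execute(
    """SELECT c.customer_name, p.product_name, o.order_date
       FROM orders AS o
       JOIN customers AS c ON c.customer_id = o.customer_id
       JOIN products  AS p ON p.product_id  = o.product_id"""
).fetchall()

# A row that breaks a rule in the schema is rejected by the DBMS.
try:
    conn.execute("INSERT INTO orders VALUES (2, '2024-01-16', 99, 1)")  # no customer 99
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(order_summary, rejected)
```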
A.2.5 CHARACTERISTICS OF SCHEMA

Schema
The design of the database is called a schema. It gives the structural view and an overall description
of the database. A database schema can be illustrated with a schema diagram, which shows the entities
and the attributes that define the schema. A schema diagram shows only the database design; it does
not show the actual data in the database. A schema can consist of a single table or of several related
tables, in which case it also represents the relationships between those tables.
Example: Suppose we have three tables: Employee, Department and Project. A schema diagram for
these tables would show that Employee and Department are related and that Employee and Project
are related.
There are three levels of the schema. The three levels of the database schema are defined according to
the three levels of data abstraction.
 External/View Schema
 Conceptual/ Logical Schema
 Internal/Physical Schema
Physical/Internal Schema (or Level)
The internal schema defines the physical storage structure of the database. It is a very low-level
representation of the entire database and contains multiple occurrences of multiple types of internal
record. In ANSI terminology, an internal record is also called a “stored record”.
 The internal schema is the lowest level of data abstraction.
 It holds information about the actual representation of the entire database, such as the
storage of the data on disk in the form of records.
 The internal view tells us what data is stored in the database and how it is stored.
 It does not deal with physical devices directly. Instead, the internal schema views a physical
device as a collection of physical pages.
Logical/Conceptual Schema (or Level)
The conceptual schema describes the structure of the whole database for the community of users.
This schema hides information about the physical storage structures and focuses on describing
data types, entities, relationships, etc.
This logical level sits between the user level and the physical storage view. There is only a single
conceptual view of a given database.
 It defines all database entities, their attributes, and their relationships.
 It includes security and integrity information.
 At the conceptual level, the data available to a user must be contained in or derivable from the
physical level.
View/External Schema (or Level)
An external schema describes the part of the database that a specific user is interested in. It hides
the unrelated details of the database from the user. There may be any number of external views for
each database. Each external view is defined using an external schema, which consists of definitions
of the various types of external record in that specific view.
An external view is the content of the database as it is seen by a particular user. For example, a user
from the sales department will see only sales-related data.
 The external level is concerned only with the data viewed by specific end users.
 This level can include several external schemas.
 The external schema level is nearest to the user.
 The external schema describes the segment of the database needed by a certain user group and
hides the remaining details of the database from that group.
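In SQL, an external view is commonly implemented with CREATE VIEW. The sketch below follows the sales department example; the employees table, its columns and the view name sales_staff are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Conceptual level: the full table, including sensitive salary data.
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT,
        department  TEXT,
        salary      INTEGER
    );
    INSERT INTO employees VALUES
        (1, 'Ana',  'Sales',   3000),
        (2, 'Ben',  'Finance', 4000),
        (3, 'Cleo', 'Sales',   3200);

    -- External level: what the sales department sees (sales staff only, no salaries).
    CREATE VIEW sales_staff AS
        SELECT employee_id, name
        FROM employees
        WHERE department = 'Sales';
    """
)

cursor = conn.execute("SELECT * FROM sales_staff ORDER BY employee_id")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```

The view stores no data of its own. It is a saved query over the underlying table, so users of the view always see current data but only the part intended for them.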
A.2.6 DATA DICTIONARY

In database management, a data dictionary (also known as a metadata repository or data catalog) is a
collection of metadata that provides information about the data in a database. The data dictionary serves
as a reference source for database administrators, developers, and users, and it provides a standardised
way to document the structure and contents of a database.
The nature of the data dictionary can vary depending on the specific database management system
being used, but it typically includes the following types of information:
● Data Element Descriptions: A data dictionary typically includes a description of each data
element or attribute used in the database, along with information such as the data type, length,
and format of the element.
● Table and Relationship Descriptions: A data dictionary may include descriptions of the tables
in the database, as well as the relationships between the tables. This information can help users
understand the structure of the database and the way data is organised within it.
● Business Rules and Constraints: A data dictionary may also include information about the
business rules and constraints that apply to the data in the database. This can include
information such as data validation rules, default values, and other constraints.
● Data Access Permissions: A data dictionary may also include information about the access
permissions that are required to view or modify data in the database. This can help to ensure
that data is accessed and used appropriately by authorized users.
● Database Management Information: A data dictionary may also include information about
the database management system itself, such as the version of the software being used, the
server configuration, and other technical details.
A data dictionary is a collection of metadata that provides a standardised way to document the structure
and contents of a database. It typically includes information about data elements, tables and
relationships, business rules and constraints, data access permissions, and other technical details related
to the database management system. By providing a centralised source of information about the
database, the data dictionary helps to ensure that data is managed effectively and used appropriately by
authorised users.
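Many DBMS products keep part of this metadata in a built-in system catalog that can itself be queried. As a sketch, SQLite exposes its catalog through the sqlite_master table and the PRAGMA table_info command; the customers table here is a made-up example, and other systems use different catalog names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE customers (
           customer_id INTEGER PRIMARY KEY,
           name        TEXT NOT NULL,
           country     TEXT DEFAULT 'Unknown'
       )"""
)

# sqlite_master is SQLite's catalog: one row per table, index, view or trigger.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]

# PRAGMA table_info returns (cid, name, type, notnull, default_value, pk) per column.
columns = [
    {"name": name, "type": col_type, "not_null": bool(notnull),
     "default": default, "primary_key": bool(pk)}
    for _cid, name, col_type, notnull, default, pk
    in conn.execute("PRAGMA table_info(customers)")
]

print(tables)
for column in columns:
    print(column)
```

The output lists each data element with its type, constraints and default value, which corresponds to the "data element descriptions" and "constraints" entries described above.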
A.2.7 DATA DEFINITION LANGUAGE
A data definition language (DDL) is a set of commands or statements used to define and manipulate the
structure of a database. A DDL is used to create and modify tables, indexes, constraints, and other
database objects, and to specify the relationships between these objects. The importance of a DDL in
implementing a data model is as follows:
● Creating Tables and Relationships: The primary function of a DDL is to create the tables and
relationships that make up a database. The DDL specifies the structure and attributes of each
table, including the data types of each field, the constraints that apply to the fields, and the
relationships between tables. By using a DDL to define these elements, developers can ensure
that the data model is accurate and consistent.
● Enforcing Data Integrity: A DDL can also be used to specify constraints that ensure the
integrity of the data in the database. For example, a DDL can specify that a certain field must be
unique or that a field cannot contain null values. These constraints help to ensure that the data in
the database is accurate and consistent.
● Facilitating Database Management: A DDL can also be used to modify the structure of a
database as needed. For example, a DDL can be used to add new tables or fields to a database,
or to modify existing fields or relationships. This allows database administrators to manage the
database effectively and make changes as needed to accommodate changing business needs.
● Supporting Data Security: Access permissions for different users or groups of users can also be
defined in SQL, using statements such as GRANT and REVOKE. These statements are often
classified separately as a data control language (DCL), but they are used alongside the DDL. By
defining these permissions, developers can ensure that the data in the database is accessed and
used appropriately by authorised users, and that sensitive data is protected from unauthorised
access.
DDL is an essential tool in implementing a data model, as it allows developers to define and manipulate
the structure of the database, enforce data integrity, facilitate database management, and support data
security. By using a DDL effectively, organisations can ensure that their databases are accurate,
consistent, and secure, and that they meet the needs of the business.
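The DDL roles described above (creating tables, enforcing integrity, and modifying structure) can be sketched with three statements. The students table, its columns and the helper try_insert are assumptions made for this example; SQLite is used because it is bundled with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE TABLE: define the structure, data types and constraints.
conn.execute(
    """CREATE TABLE students (
           student_id INTEGER PRIMARY KEY,
           email      TEXT NOT NULL UNIQUE
       )"""
)
conn.execute("INSERT INTO students (email) VALUES ('ana@example.com')")

def try_insert(sql):
    """Return True if the DBMS accepts the statement, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# The constraints declared in the DDL are enforced on every later update.
duplicate_ok = try_insert("INSERT INTO students (email) VALUES ('ana@example.com')")  # UNIQUE
null_ok = try_insert("INSERT INTO students (email) VALUES (NULL)")                    # NOT NULL

# ALTER TABLE: change the structure later without recreating the table.
conn.execute("ALTER TABLE students ADD COLUMN year_group INTEGER")
column_names = [row[1] for row in conn.execute("PRAGMA table_info(students)")]

print(duplicate_ok, null_ok, column_names)
```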
A.2.8 DATA MODELING

Data modeling is a critical step in the design of a database because it allows developers to create a
visual representation (blueprint) of the database structure and the relationships between the data
elements. The importance of data modeling in the design of a database can be explained as follows:
● Data Consistency and Accuracy: A well-designed data model ensures data consistency and
accuracy. A data model defines the rules, constraints, and relationships that govern how data is
organised and stored in the database. By ensuring that data is organised consistently and
accurately, a data model reduces the risk of data inconsistencies and errors.
● Efficiency: A data model helps to improve the efficiency of a database by reducing data
redundancy and improving data retrieval speed. A data model helps to identify and eliminate
data redundancy, ensuring that data is stored only once in the database. This reduces storage
requirements and improves data retrieval speed.
● Flexibility: A well-designed data model is flexible and can adapt to changing business needs. A
data model can be updated and modified easily to accommodate new requirements or changing
business needs.
● Collaboration: A data model helps to facilitate collaboration between developers, database
administrators, and other stakeholders involved in the design of the database. A data model
provides a shared understanding of the database structure and relationships, which helps to
ensure that all stakeholders are on the same page.
● Maintainability: A data model helps to improve the maintainability of a database. A data
model provides a clear understanding of the database structure, which helps to ensure that
changes to the database can be made easily and without impacting other areas of the database.
The importance of data modeling in the design of a database cannot be overstated. A well-designed data
model ensures data consistency and accuracy, and improves efficiency, flexibility, collaboration, and
maintainability. By creating a clear blueprint of the database structure and relationships, developers can
create a database that is well-organised, efficient, and flexible enough to meet changing business needs.
A.2.9 | DATABASE TERMINOLOGY

Define the following database terms: table, record, field, primary key, secondary key, foreign key,
candidate key, composite primary key, join
● Table: A table is a collection of related data organised in rows and columns. Tables are used to
store data in a database and are often named based on the type of data they contain.
● Record: A record is a collection of data that represents a single entity in a table. A record is
also known as a row, and it typically contains information about a specific item or object, such
as a customer, order, or product.
● Field: A field is a single piece of data stored in a record. A field is also known as a column, and
it represents a specific attribute or characteristic of the entity represented by the record.
● Primary Key: A primary key is a field or combination of fields in a table that uniquely
identifies each record in the table. A primary key is used to enforce data integrity and ensure
that no two records in the table are identical.
● Secondary Key: A secondary key is a field or combination of fields in a table that is not the
primary key but can be used to access and query data in the table.
● Foreign Key: A foreign key is a field in a table that refers to the primary key of another table.
A foreign key is used to create a relationship between two tables and ensure data integrity
across the tables.
● Alternate Key: A candidate key other than the primary key is called an alternate key.
● Super Key: A super key is a set of one or more fields that, taken together, uniquely identifies
each row in a table. It may include fields that are not needed for uniqueness; a candidate key is
a super key with no such extra fields.
● Unique Key: A unique key in SQL is a field or set of fields whose values must be unique across
the records of a table. It is similar to the primary key but can accept a null value.
● Candidate Key: A candidate key is a field or minimal combination of fields that uniquely
identifies each record in a table, so it could be chosen as the primary key. The primary key is
the candidate key that the designer selects.
● Composite Primary Key: A composite primary key is a primary key that consists of two or
more fields in a table. A composite primary key is used when a single field is not sufficient to
uniquely identify each record in the table.
● Join: A join is a database operation that combines data from two or more tables based on a
related field. A join is used to combine data from multiple tables into a single result set that can
be used for data analysis or reporting.
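The key terms above can be sketched in code. This is a minimal illustration only, assuming Python's
built-in sqlite3 module; the table and column names (customers, orders, order_items, email and so on)
are hypothetical and are not taken from any exam scenario.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,       -- primary key
    email       TEXT NOT NULL UNIQUE,      -- candidate key not chosen as primary (alternate key)
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id)  -- foreign key
);
CREATE TABLE order_items (
    order_id INTEGER NOT NULL REFERENCES orders(order_id),
    product  TEXT NOT NULL,
    quantity INTEGER NOT NULL,
    PRIMARY KEY (order_id, product)        -- composite primary key
);
""")

conn.execute("INSERT INTO customers VALUES (1, 'john@example.com', 'John Smith')")
conn.execute("INSERT INTO orders VALUES (10, 1)")
conn.execute("INSERT INTO order_items VALUES (10, 'Pen', 2)")
conn.execute("INSERT INTO order_items VALUES (10, 'Ink', 1)")  # same order, different product

# A second row with the same (order_id, product) pair breaks the composite primary key.
try:
    conn.execute("INSERT INTO order_items VALUES (10, 'Pen', 5)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

item_count = conn.execute("SELECT COUNT(*) FROM order_items").fetchone()[0]
print(duplicate_rejected, item_count)
```

In this sketch neither order_id nor product is unique on its own in order_items, which is why the two
fields are combined into a composite primary key.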
WHAT IS AN INNER JOIN

An inner join is a type of join operation in a database that combines data from two or more tables
based on a common field. An inner join returns only the rows from each table that have matching
values in the specified field, excluding any rows that do not have matching values.
Here is an example to illustrate how an inner join works:
Suppose you have two tables, a "Customers" table and an "Orders" table. The "Customers" table
contains information about each customer, such as their name and address, while the "Orders" table
contains information about each order, such as the order number and the customer who placed the
order. Both tables have a common field, such as a customer ID.
To perform an inner join between these two tables, you would specify the customer ID field as the
common field. The inner join would then return only the rows from each table where there is a
matching customer ID, and exclude any rows where there is no matching customer ID.
For example, suppose the "Customers" table has a row with a customer ID of 123 and a name of "John
Smith", and the "Orders" table has a row with an order number of 456 and a customer ID of 123. When
you perform an inner join between these tables, the result set would contain a single combined row for
customer ID 123 (John Smith, order 456). It would exclude any customers with no orders and any
orders with no matching customer ID.
In summary, an inner join is a type of join operation that combines data from two or more tables based
on a common field, returning only the rows that have matching values in the specified field. Inner joins
are commonly used in database management to combine data from multiple tables into a single result
set for data analysis or reporting.
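The Customers and Orders example above can be tried out with a short script. This is a sketch that
assumes Python's built-in sqlite3 module. Customer 123 and order 456 come from the example; the
second customer, the second order and the column names are made up to show which rows are excluded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Orders (order_number INTEGER PRIMARY KEY, customer_id INTEGER);
INSERT INTO Customers VALUES (123, 'John Smith');
INSERT INTO Customers VALUES (124, 'Jane Doe');   -- has no orders
INSERT INTO Orders VALUES (456, 123);
INSERT INTO Orders VALUES (457, 999);             -- no such customer
""")

# Only rows with a matching customer_id in both tables are returned.
rows = conn.execute("""
    SELECT Customers.customer_id, Customers.name, Orders.order_number
    FROM Customers
    INNER JOIN Orders ON Customers.customer_id = Orders.customer_id
""").fetchall()
print(rows)  # [(123, 'John Smith', 456)]
```

Jane Doe (no orders) and order 457 (no matching customer) do not appear in the result, because an
inner join keeps only the matching rows.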
A.2.10 | ENTITY RELATIONSHIP DIAGRAMS

An entity-relationship diagram (ERD) is a graphical representation of the relationships between
entities in a database. It is used to model the data and relationships that exist within a database, and is a
key tool in the database design process.
An ERD consists of entities, attributes, and relationships.
An entity is a person, place, thing, or event that is relevant to the database, and is represented by a
rectangle on the diagram.
An attribute is a characteristic or property of an entity, and is represented by an oval or ellipse.
Types of attribute
Simple Attributes-
Simple attributes are those attributes which cannot be divided further, for example an age.
Composite Attributes-
Composite attributes are those attributes which are composed of many other simple attributes, for
example a name made up of a first name and a last name.
Single Valued Attributes-

Single valued attributes are those attributes which can take only one value for a given entity from an
entity set.
Multi Valued Attributes-

Multi valued attributes are those attributes which can take more than one value for a given entity from
an entity set.
Derived Attributes-
Derived attributes are those attributes which can be derived from other attribute(s).
Key attributes
Key attributes are those attributes which can identify an entity uniquely in an entity set. Key attributes
are indicated by underlining the attribute label. For our computer company, each employee is given an
ID number for unique identification.
RELATIONSHIP
A relationship is a connection between entities, and is represented by a line that connects the related
entities.
Cardinality ratios and participation

● 1:1, read as “one-to-one”
● 1:N, read as “one-to-many” (equivalently, N:1, or “many-to-one”)
● N:M (or N:N), read as “many-to-many”
We show the cardinalities on our model next to the line connecting the relationship to the entity:
Participation is a closely related topic. An entity is said to have total participation in a relationship if
every instance of the entity must be matched with instances of the other entity in the relationship. Here
is an example - note that this is a second relationship between employee and factory:
Putting it together
Below is a diagram incorporating the examples above, with some additional attributes to fill out the
entities:
ERDs are used to model complex databases, allowing developers to visualise the relationships between
entities and to identify any potential issues or inconsistencies in the design. They are often used in
conjunction with other tools, such as data flow diagrams and data dictionaries, to ensure that the
database is well-designed and meets the requirements of the stakeholders. ERDs can also be used to
communicate the design of the database to non-technical stakeholders, such as business analysts and
project managers, in a clear and understandable way.
A.2.11 | ISSUES WITH REDUNDANT DATA
Redundant data refers to data that is unnecessarily duplicated or repeated in a database. Redundant data
can cause a number of issues, including:
● Data Inconsistency: When data is stored redundantly, it is possible for different copies of the
same data to become inconsistent.
● Data Integrity: Redundant data can also compromise data integrity by making it more difficult
to maintain the accuracy and completeness of the data. When data is stored redundantly, it is
more difficult to ensure that all copies of the data are updated consistently and accurately.
● Storage Costs: Redundant data can also be costly in terms of storage space. When data is
duplicated unnecessarily, it takes up more space in the database, which can increase storage
costs and reduce system performance.
● Maintenance Costs: Redundant data can also increase the cost of maintaining and updating the
database. When data is stored redundantly, it requires additional effort to keep all copies of the
data up to date and accurate.
● Security Risks: Redundant data can also pose security risks by increasing the number of
potential attack points for malicious actors. If redundant data is not properly secured, it can be
more easily accessed and manipulated by unauthorised users.
Redundant data can cause a number of issues for a database, including data inconsistency,
compromised data integrity, increased storage and maintenance costs, and security risks. By
eliminating or minimising redundant data in a database, organisations can improve the accuracy and
consistency of their data, reduce storage and maintenance costs, and enhance data security.
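The data inconsistency issue can be demonstrated in a few lines. This is a sketch under stated
assumptions: it uses Python's built-in sqlite3 module, and the orders_flat table and its contents are
hypothetical. The customer's city is stored redundantly on every order row, and only one copy is
updated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The customer's city is repeated on every order row (redundant data).
CREATE TABLE orders_flat (
    order_number  INTEGER PRIMARY KEY,
    customer_name TEXT,
    customer_city TEXT
);
INSERT INTO orders_flat VALUES (1, 'John Smith', 'London');
INSERT INTO orders_flat VALUES (2, 'John Smith', 'London');
""")

# The customer moves, but only one of the two copies is updated.
conn.execute("UPDATE orders_flat SET customer_city = 'Leeds' WHERE order_number = 1")

cities = conn.execute("""
    SELECT DISTINCT customer_city FROM orders_flat
    WHERE customer_name = 'John Smith'
    ORDER BY customer_city
""").fetchall()
print(cities)  # [('Leeds',), ('London',)]
```

The same customer now appears to live in two cities. If the city were stored once, in a separate
customers table, a single update would keep the data consistent.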
A.2.12 REFERENTIAL INTEGRITY

Referential integrity is an important concept in database design, particularly in a normalised database.
It requires that every foreign key value either matches an existing primary key value in the related
table or is null. Referential integrity:
● Ensures that the data in the database is accurate and consistent.
● Ensures that all related data is kept up to date.
Referential integrity also helps to ensure data integrity in the database by:
● Preventing the insertion of invalid data into the database.
● Ensuring that data is entered into the correct tables with the correct relationships, reducing
the risk of data errors and inconsistencies.
Referential integrity ensures that the relationships between tables in a database are maintained, which
supports data accuracy, consistency and integrity. By ensuring that related data remains accurate and
consistent, referential integrity helps organisations to make better decisions and operate more
effectively.
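How a DBMS enforces referential integrity can be sketched as follows. The example assumes Python's
built-in sqlite3 module and hypothetical customers and orders tables. Note that SQLite only enforces
foreign keys after PRAGMA foreign_keys = ON has been issued; other database systems may behave
differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # foreign key enforcement is off by default in SQLite
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    order_number INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL REFERENCES customers(customer_id)
);
INSERT INTO customers VALUES (123, 'John Smith');
INSERT INTO orders VALUES (456, 123);
""")

def is_rejected(sql):
    """Return True if the statement is refused because it breaks a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# An order for a customer that does not exist would be an orphan record.
orphan_rejected = is_rejected("INSERT INTO orders VALUES (457, 999)")
# Deleting a customer who still has orders would leave those orders orphaned.
delete_rejected = is_rejected("DELETE FROM customers WHERE customer_id = 123")
print(orphan_rejected, delete_rejected)
```

Both statements are refused, so every customer_id in orders always refers to a real customer.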
A.2.13 NORMALISATION
The normalisation process is used to organise data in a database into tables and establish relationships
between them. The process involves several steps, each of which is designed to remove data
redundancies and dependencies. The three most commonly used normal forms are 1st Normal Form
(1NF), 2nd Normal Form (2NF), and 3rd Normal Form (3NF). Here are the differences between each
of these normal forms:
● 1st Normal Form (1NF): In 1NF, each table in a database contains only atomic values,
meaning that each column contains only a single value. This means that data is not stored in a
repeating group or array format, and each table has a primary key that uniquely identifies each
row.
○ For example, let’s say we have a table named “Students” that stores information about
students in a school. A table that is not in 1NF might look like this:
In this table, the Subject column contains multiple values separated by commas, which
violates the rule of atomic values. This table is not in 1NF. To bring this table to 1NF, we
would need to store each subject in its own row, repeating the student information for each
subject taken, so that every cell holds a single value.
This table is now in 1NF, since all the data is atomic, each cell contains only one value, and
there are no repeating groups of data.
1NF is the starting point for normalisation. To improve data integrity and consistency
further, the table is then taken to the next normal forms.
● 2nd Normal Form (2NF): In 2NF, the table must be in 1NF and each non-key column must be
functionally dependent on the entire primary key. This means that each non-key column must
be uniquely determined by the primary key, and cannot be determined by a subset of the
primary key.
For example, let’s say we have a table named “Orders” that stores information about customer
orders. A table that is not in 2NF might look like this:
To bring this table to 2NF, we need to separate the table into two separate tables: one for the
Orders and one for the Products.
Now, the Price column is dependent on the primary key of the Products table (Product) and the
Orders table has no partial dependencies. This design is now in 2NF.
2NF eliminates partial dependencies and improves data integrity by reducing data
anomalies. However, some anomalies can remain, so the design is then taken to the next
normal forms.
● 3rd Normal Form (3NF): Third Normal Form (3NF) builds upon the rules of Second Normal
Form (2NF) by addressing the issue of transitive dependencies. In 3NF, a table must not have
any transitive dependencies. A transitive dependency exists when a non-primary key column
depends on another non-primary key column, rather than on the primary key.
To achieve 3NF, a table must already be in 2NF and all non-primary key columns must be
directly dependent on the primary key.
For example, let’s say we have a table named “Employees” that stores information about
employees in a company. A table that is not in 3NF might look like this:
In this table, the Department column depends on the primary key (EmployeeID), but the
Manager column depends on the Department column rather than directly on the primary key.
This table is not in 3NF because Manager depends on EmployeeID only transitively, through
the Department column.
To bring this table to 3NF, we need to separate the table into two separate tables: one for the
Employees and one for the Departments.
Now, the Manager column is dependent on the primary key of the Departments table
(Department) and the Employees table has no transitive dependencies. This design is now in
3NF.
3NF eliminates transitive dependencies and improves data integrity and consistency by
reducing data anomalies. However, some anomalies can still remain, so in some cases it is
necessary to move to further normal forms such as Boyce-Codd Normal Form (BCNF) or
Fourth Normal Form (4NF).
● Boyce-Codd Normal Form (BCNF): A relation is in BCNF if and only if for every one of its
non-trivial functional dependencies X → Y, X is a superkey.
● Fourth Normal Form (4NF): A table is in 4NF if it is in BCNF and it has no multi-valued
dependencies.
● Fifth Normal Form (5NF): A relation R is in 5NF if every non-trivial join dependency in R is
implied by the candidate keys of R.
To summarise, 1NF requires that each table contain only atomic values, 2NF requires that each
non-key column be functionally dependent on the entire primary key, and 3NF requires that no
non-key column depend on another non-key column. These normal forms are used to ensure that
the data in a database is organised efficiently, and is free from unnecessary redundancy and
unwanted dependencies.
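The 3NF step for the Employees example can be sketched in code. The sketch assumes Python's built-in
sqlite3 module; the employee names, departments and managers are invented for illustration, since the
original tables are not reproduced here. The Manager column is moved into a separate departments
table, so each manager is stored once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Manager depends on Department, so it lives in its own table.
CREATE TABLE departments (
    department TEXT PRIMARY KEY,
    manager    TEXT NOT NULL
);
-- Every non-key column here depends directly on employee_id.
CREATE TABLE employees (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    department  TEXT NOT NULL REFERENCES departments(department)
);
INSERT INTO departments VALUES ('Sales', 'Ms Khan'), ('IT', 'Mr Lee');
INSERT INTO employees VALUES (1, 'Asha', 'Sales'), (2, 'Ben', 'Sales'), (3, 'Chen', 'IT');
""")

# One update changes the manager for every employee in Sales.
conn.execute("UPDATE departments SET manager = 'Mr Ortiz' WHERE department = 'Sales'")

# A join rebuilds the original, un-normalised view when it is needed.
rows = conn.execute("""
    SELECT e.employee_id, e.name, e.department, d.manager
    FROM employees AS e
    INNER JOIN departments AS d ON e.department = d.department
    ORDER BY e.employee_id
""").fetchall()
for row in rows:
    print(row)
```

In the un-normalised table the manager's name would have to be changed on every Sales row, with the
risk of missing one. After the split, it is changed in exactly one place.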
A.2.14 | CHARACTERISTICS OF A NORMALISED DATABASE

Database normalisation is a process used to organise data in a database into tables and establish
relationships between them, with the aim of reducing data redundancy and ensuring data integrity. A
normalised database has the following characteristics:
● Minimal Data Redundancy: A normalised database minimises data redundancy by organising
data into tables and removing data that is repeated or duplicated unnecessarily. This helps to
reduce the size of the database and improve database performance.
● Consistent Data: A normalised database ensures that data is consistent across tables by
removing data redundancies and dependencies. This helps to improve data integrity and reduce
the risk of data inconsistencies.
● Reduced Update Anomalies: A normalised database reduces the risk of update anomalies by
ensuring that data is stored in the appropriate table and that each table contains only a single,
logically related category of data. This helps to ensure that updates to the data are made only
once and that the data remains consistent across the database.
● Increased Scalability: A normalised database is highly scalable, meaning that it can be easily
expanded or modified to accommodate new data or changing business needs. This is because
the database is organised into tables, which can be modified or added as needed without
affecting the rest of the database.
● Improved Query Performance: A normalised database often has better query performance
because data is organised into smaller, more manageable tables. This allows many queries and
updates to be processed more quickly and efficiently, although queries that must join many
tables can be slower.
● Simplified Maintenance: A normalised database is easier to maintain because data is organised
into tables, making it easier to identify and fix errors or inconsistencies in the data. This helps to
reduce the cost and effort required to maintain the database over time.
A normalised database is structured in accordance with the principles of database normalisation, with
the aim of reducing data redundancy and ensuring data integrity. A normalised database is characterised
by minimal data redundancy, consistent data, reduced update anomalies, increased scalability,
improved query performance, and simplified maintenance.
A.2.15 | DATA TYPES

Evaluate the appropriateness of the different data types.
The term 'data type' refers to the kind of value that a field can hold.
● For example, it could be text, numbers, dates or times, Boolean (Yes/No), currency, or an object
such as an image or link.
● Each data type has its own data format; for example, a date might be written DD/MM/YY or
MM/DD/YY.
● Before setting up a database you will need to decide on data types and data formats for each
field within your database.
● Once the data type and format for each field are set when you first create the database, they
should not be changed. They restrict the data that is allowed to be entered, which helps to ensure
data integrity. Data integrity is the completeness, correctness or accuracy of data.
The list below shows some data types with example data formats.
Text: Two options within the Text data type are short text and long text. Short text is used for entries
of under 256 characters. As standard, most databases are set to short text, and you would need to
specify long text at the set-up stage if you want to change this.
Numbers: Numbers can normally be formatted as integer, decimal or scientific.
Boolean: Boolean fields are used when you want to enforce one of two options, for example YES or
NO, ON or OFF, M or F, TRUE or FALSE, 1 or 0. Some database software may only allow 1 or 0 as
options in the Boolean data type selection.
Date/Time: Date and time are often combined, and after selecting this as a data type you select the
format. Time options may be 12 or 24 hour formats and then the format such as hh:mm:ss. Date options
normally include the structure of the date such as DD/MM/YYYY.
Currency: The Currency data type allows you to choose the currency used, for example $ or £. It
also allows you to specify the number of decimal places.
Object: An object would normally be something that you cannot enter via the keyboard such as music
or a picture, but you could also have items such as hyperlinks as objects.
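The idea that a field's data type restricts what can be entered can be sketched in code. This example
assumes Python's built-in sqlite3 module and a hypothetical products table. SQLite is flexible about
types, so the sketch uses CHECK constraints to imitate the stricter behaviour of other database
software; it is an illustration, not the way any particular package implements data types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    name       TEXT    NOT NULL CHECK (length(name) < 256),          -- short text
    quantity   INTEGER NOT NULL CHECK (typeof(quantity) = 'integer'),
    in_stock   INTEGER NOT NULL CHECK (in_stock IN (0, 1))           -- Boolean stored as 1 or 0
)
""")

def try_insert(values):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO products (name, quantity, in_stock) VALUES (?, ?, ?)", values
        )
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert(("Pen", 5, 1))                            # valid row
text_as_number_rejected = not try_insert(("Ink", "five", 1))    # text in a number field
bad_boolean_rejected = not try_insert(("Pad", 3, 2))            # 2 is not a Boolean value
print(accepted, text_as_number_rejected, bad_boolean_rejected)
```

Only the first row is stored. Rejecting the other two at entry time is one way that data types help to
protect data integrity.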
A.2.16 Construct an entity-relationship diagram (ERD) for a given scenario.
SCENARIO 1:
Marble Reading Book Stores (MRBS) is a chain of bookstores based in London. The stores want to
keep information about the books they sell, the authors of the books and the publishers they work with.
The assumptions made when the database was created were:
o a publisher can publish books from one or more authors
o an author can write one or more books.
Three of the tables in the MRBS database are shown below:
CONSTRUCT AN ER DIAGRAM FOR THE ABOVE SCENARIO:
SCENARIO 2:
SCENARIO 3:
A.2.17 Construct a relational database to 3NF using objects such as tables, queries, forms,
reports and macros.
EXAMPLE 1:
EXAMPLE 2:
A.2.18 | QUERIES
A query is a request for data from a database. By executing a query, a user can retrieve and manipulate
data stored in the database. A query can also be used to provide a view of a database, allowing users to
see the data in a specific format that is customised to their needs. Queries can provide a view of a
database in some of the following ways:
● Selecting Fields: When creating a query, the user can select the fields they want to view from
the database. This allows them to focus on specific information that is relevant to their needs.
For example, a user might create a query that selects only the customer name, order number,
and date for all orders placed in the last month.
SELECT customer_name, order_number, order_date FROM orders
WHERE order_date >= '2024-02-01' AND order_date < '2024-03-01'
● Filtering Data: Queries can also be used to filter data based on specific criteria. This allows the
user to view only the data that meets their specific needs.
For example, a user might create a query that only shows orders from a specific region or orders
that contain a certain product.
SELECT * FROM orders
WHERE order_region = 'Northwest Alabama'
● Sorting Data: Queries can also be used to sort data in a specific way. This allows the user to
view the data in a way that makes sense for their needs.
For example, a user might create a query that sorts orders by order date, so that the most recent
orders appear at the top. DESC denotes Descending. ASC denotes Ascending.
SELECT * FROM orders
ORDER BY order_date DESC
● Grouping Data: Queries can also be used to group data together based on specific criteria. This
allows the user to view data in a summarised format.
For example, a user might create a query that groups orders by region or by product category.
SELECT product_category, COUNT(*) FROM orders
GROUP BY product_category
● Calculating Data: Queries can also be used to calculate data based on specific criteria. This
allows the user to view the data in a way that is useful for their needs.
For example, a user might create a query that calculates the average order size or total sales for
a specific period.
SELECT AVG(order_size), SUM(sales) FROM orders
WHERE order_date >= '2024-02-01' AND order_date < '2024-03-01'
By combining these techniques, a query can provide a view of a database that is customised to the
user's needs. This allows the user to access the data they need in a way that is easy to understand and
use.
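The techniques above (selecting, filtering, grouping, calculating and sorting) can be combined in one
query. The sketch below assumes Python's built-in sqlite3 module; the orders table reuses the column
names from the examples above, but the rows themselves are invented so that the result can be checked
by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    order_number     INTEGER PRIMARY KEY,
    customer_name    TEXT,
    order_date       TEXT,     -- ISO dates (YYYY-MM-DD) compare correctly as text
    order_region     TEXT,
    product_category TEXT,
    sales            REAL
);
INSERT INTO orders VALUES
    (1, 'John Smith', '2024-02-03', 'Northwest Alabama', 'Books', 20.0),
    (2, 'Jane Doe',   '2024-02-10', 'Northwest Alabama', 'Games', 40.0),
    (3, 'John Smith', '2024-02-15', 'Northwest Alabama', 'Books', 30.0),
    (4, 'Jane Doe',   '2024-03-01', 'Northwest Alabama', 'Books', 99.0),
    (5, 'Sam Jones',  '2024-02-20', 'Elsewhere',         'Games', 70.0);
""")

summary = conn.execute("""
    SELECT product_category, COUNT(*) AS order_count, SUM(sales) AS total_sales  -- select, calculate
    FROM orders
    WHERE order_region = 'Northwest Alabama'                                     -- filter
      AND order_date >= '2024-02-01' AND order_date < '2024-03-01'
    GROUP BY product_category                                                    -- group
    ORDER BY total_sales DESC                                                    -- sort
""").fetchall()
print(summary)  # [('Books', 2, 50.0), ('Games', 1, 40.0)]
```

Order 4 is left out because it falls outside February, and order 5 because it is from another region.
The remaining three orders are summarised by category.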
A.2.19 | SIMPLE QUERIES VS COMPLEX QUERIES

SIMPLE QUERIES:
A simple query is a basic request for data from a database, typically involving only a single table and a
small number of fields.
A simple query is usually straightforward and easy to understand, and can be created using simple
query languages or graphical user interfaces.
COMPLEX QUERIES
In contrast, a complex query is a more sophisticated request for data from a database, often involving
multiple tables and complex operations. A complex query can be used to retrieve data that meets
specific criteria or to perform advanced calculations or data manipulations.
Some key differences between simple and complex queries:

SIMPLE QUERIES VS COMPLEX QUERIES
● Complexity
○ Simple queries: Simple queries are less complex than complex queries, typically involving
only a single table and a small number of fields.
○ Complex queries: Complex queries, on the other hand, involve multiple tables, complex
operations, and advanced functions.
● Purpose
○ Simple queries: Simple queries are often used to retrieve specific data from a database.
○ Complex queries: Complex queries are used to perform advanced data manipulation and
analysis.
● Performance
○ Simple queries: Simple queries are usually faster and more efficient than complex queries,
as they involve less data and processing.
○ Complex queries: Complex queries, on the other hand, can be slower and more
resource-intensive, especially if they involve large amounts of data or complex calculations.
● Ease of Use
○ Simple queries: Simple queries are generally easier to create and understand than complex
queries. Simple queries can often be created using simple query languages (SQL) or
graphical user interfaces (GUI).
○ Complex queries: Complex queries may require more advanced technical skills and
knowledge.
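The contrast can be seen by running one query of each kind against the same data. This is only a
sketch, assuming Python's built-in sqlite3 module and hypothetical customers and orders tables. The
simple query reads one table; the complex query joins two tables, groups the rows, calculates a total
and filters on that total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_number INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'John Smith'), (2, 'Jane Doe');
INSERT INTO orders VALUES (1, 1, 20.0), (2, 1, 30.0), (3, 2, 15.0);
""")

# Simple query: one table, one field.
simple = conn.execute("SELECT name FROM customers ORDER BY name").fetchall()

# Complex query: join, grouping, aggregate function and a condition on the aggregate.
complex_result = conn.execute("""
    SELECT c.name, SUM(o.total) AS total_spent
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.name
    HAVING SUM(o.total) > 25
    ORDER BY total_spent DESC
""").fetchall()

print(simple)          # [('Jane Doe',), ('John Smith',)]
print(complex_result)  # [('John Smith', 50.0)]
```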
A.2.20 | CREATING A QUERY

Constructing a query involves creating a request for data from a database. The methods used to
construct a query can vary depending on the type of database, the query language being used, and the
specific requirements of the user. Although at IB level it is quite likely that a query will be done using
SQL, here are some different methods that can be used:
● Graphical User Interfaces (GUIs): Many database management systems (DBMS) provide
graphical user interfaces (GUIs) that allow users to create queries using a visual interface. Users
can select tables and fields, add filters and sorting criteria, and build complex queries using
drag-and-drop functionality.
● Query Languages: Query languages such as SQL (Structured Query Language) and LINQ
(Language-Integrated Query) can be used to construct queries. These languages provide a
syntax for creating queries that can be executed on a database. SQL is a standard language used
for creating and managing relational databases, while LINQ is a .NET Framework component
used to query collections and databases.
● Stored Procedures: A stored procedure is a set of precompiled SQL statements that can be
executed on a database. Stored procedures can be created to perform specific tasks or to retrieve
data that meets specific criteria. They can be called from applications or other stored procedures
to retrieve data from the database.
● Data Access Layers: Data access layers provide a way to abstract the database from the
application code. They provide a set of methods and functions that can be used to retrieve data
from the database. The data access layer can be used to create queries, and the results can be
returned to the application code for further processing.
● Object-Relational Mapping (ORM): ORM tools provide a way to map database tables to
object-oriented code. The ORM tool can be used to create queries and retrieve data from the
database. The results can be returned as objects that can be used by the application code.
● Web-Based Interfaces: Web-based interfaces can be used to create and execute queries from a
web browser. These interfaces can provide a simple way to access the database from anywhere
with an internet connection.
There are several methods that can be used to construct a query, including graphical user interfaces,
query languages, stored procedures, data access layers, object-relational mapping tools, and web-based
interfaces. The choice of method depends on the specific requirements of the user and the database
management system being used.
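As an illustration of the query language and data access layer methods, the sketch below wraps an SQL
query in a small function that application code can call. It assumes Python's built-in sqlite3 module.
The function name orders_for_region and the table contents are hypothetical; a real data access layer
would normally be larger and specific to the application.

```python
import sqlite3

def orders_for_region(conn, region):
    """Hypothetical data access function: list (order_number, customer_name) for one region."""
    cursor = conn.execute(
        # The ? placeholder passes the value separately from the SQL text.
        "SELECT order_number, customer_name FROM orders "
        "WHERE order_region = ? ORDER BY order_number",
        (region,),
    )
    return cursor.fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_number INTEGER PRIMARY KEY, customer_name TEXT, order_region TEXT);
INSERT INTO orders VALUES (1, 'John Smith', 'Northwest Alabama');
INSERT INTO orders VALUES (2, 'Jane Doe', 'Elsewhere');
INSERT INTO orders VALUES (3, 'Sam Jones', 'Northwest Alabama');
""")

result = orders_for_region(conn, "Northwest Alabama")
print(result)  # [(1, 'John Smith'), (3, 'Sam Jones')]
```

The application code calls orders_for_region without needing to know the SQL behind it, which is the
abstraction that a data access layer is meant to provide.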
UNIT A.3 FURTHER ASPECTS OF DATABASE MANAGEMENT

A.3.1 Explain the role of a database administrator.
A.3.2 Explain how end-users can interact with a database.
A.3.3 Describe different methods of database recovery.
A.3.4 Outline how integrated database systems function.
A.3.5 Outline the use of databases in areas such as stock control, police records, health
records, employee data.
A.3.6 Suggest methods to ensure the privacy of the personal data and the responsibility
of those holding personal data not to sell or divulge it in any way.
A.3.7 Discuss the need for some databases to be open to interrogation by other parties
(police, government, etc).
A.3.8 Explain the difference between data matching and data mining.
A.3.1 | THE ROLE OF THE DATABASE ADMINISTRATOR
A database administrator (DBA) is responsible for the design, implementation, maintenance, and
management of an organisation's databases. The specific responsibilities of a DBA may vary depending
on the size and complexity of the organisation, but some common responsibilities include:
● Design and implementation: DBAs are responsible for designing and implementing the
database architecture, including the physical storage and organisation of the data, the logical
relationships between the data entities, and the security and access controls.
● Maintenance and performance tuning: DBAs are responsible for maintaining the databases
and ensuring their performance and availability. This includes monitoring performance metrics,
tuning the database for optimal performance, and performing regular backups and disaster
recovery operations.
[Diagram: the responsibilities of a DBA, namely design and implementation, maintenance and
performance tuning, data security, user management, data modelling and architecture, and
monitoring and troubleshooting]
● Data security: DBAs are responsible for ensuring the security and privacy of the data stored in
the databases, including the implementation of access controls, data encryption, and other
security measures.
● User management: DBAs are responsible for managing user accounts and permissions,
ensuring that the appropriate access controls are in place to ensure the security and privacy of
the data.
● Data modelling and architecture: DBAs are responsible for defining the data models and
architecture that support the organisation's data requirements, ensuring that the data is organised
in a way that supports the organisation's goals and objectives.
● Monitoring and troubleshooting: DBAs are responsible for monitoring the databases and
troubleshooting any issues that arise, including performance bottlenecks, data integrity
problems, and security incidents.
● Training and support: DBAs may also be responsible for providing training and support to
other stakeholders, such as developers and end-users, to help them effectively use the databases
and understand the data stored in the databases.
The role of a DBA is critical to the success of an organisation, as the DBA is responsible for ensuring
the accuracy, security, and performance of the organisation's data. By fulfilling these responsibilities,
DBAs help organisations to make informed decisions and support their goals and objectives.
A.3.2 | HOW END USERS INTERACT WITH THE DATABASE

Database administrators, internal employees, and external customers all have different roles and
responsibilities within an organisation, and as a result, they may interact with a database in different
ways. Additionally, the interfaces provided to these groups may also differ, depending on their specific
needs and requirements.
Here are some examples of how these groups may interact with a database and the interfaces they may
be provided with:
● Database Administrators: Database administrators (DBAs) are responsible for managing the
database and ensuring its security, availability, and performance. They typically interact with
the database using specialized tools that are designed to manage and monitor databases. These
tools may include command-line interfaces, GUIs, or web-based interfaces that allow DBAs to
perform tasks such as creating backups, managing users and permissions, monitoring
performance, and tuning the database.
● Internal Employees: Internal employees may interact with a database in a variety of ways,
depending on their roles and responsibilities.
For example, sales representatives may use a web-based interface to access customer data and
create new orders, while managers may use reporting tools to generate reports and analyse data.
The interfaces provided to internal employees may be customised to their specific needs, and
may include features such as forms, dashboards, and reports that are tailored to their roles.
● External Customers: External customers may interact with a database using web-based
interfaces or mobile applications. These interfaces may allow customers to view their account
information, place orders, or track shipments. The interfaces provided to external customers are
typically designed to be easy to use and intuitive, with a focus on providing a positive user
experience.
In summary, the key is to provide each group with an interface that is customised to its needs, and that
makes it easy to perform the tasks its members need to do in an efficient and user-friendly manner.
A.3.3 | DATABASE RECOVERY

Database recovery refers to the process of restoring a database to a consistent state after a failure or an
error. There are different methods of database recovery, depending on the type of failure and the
recovery time required by the organisation. Here are some of these methods:
● System Log: A system log, also known as a transaction log or audit trail, is a record of all
changes made to a database.
○ It is used to track the history of database transactions, and to provide a way to recover
from errors or failures.
○ The system log records details such as the time and date of the transaction, the user who
made the change, and the type of change that was made.
○ By using a system log, it is possible to undo or redo changes that were made to the
database, which can be useful in recovering from errors or restoring the database to a
previous state.
● Deferred Update:
○ Deferred update is a technique used in database management to reduce the risk of data
inconsistencies after a failure; it can also improve performance.
○ In a deferred update system, changes made to a database are not immediately written to
disk, but are instead held in memory until a commit point is reached.
○ Once the commit point is reached, all the changes are written to the database in a single
batch. This can reduce the overhead of writing to the disk for each individual
transaction, and can improve performance.
○ If a failure occurs before the commit point, the database itself has not been changed, so
there is nothing to undo.
○ However, it also means that data may not be immediately available for other
transactions, which can lead to concurrency issues.
● Mirroring:
○ Mirroring involves creating a duplicate copy of a database on a separate server.
○ In the event of a failure, the duplicate copy can be used to recover the database.
○ This method can be used to provide high availability and fast recovery times, but it can
be complex to set up and maintain.
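The system log and deferred update ideas above can be illustrated with a minimal Python sketch. The
TinyDB class, its method names and the layout of the log records are hypothetical and chosen only for
illustration; a real DBMS keeps its log on durable storage and handles many more cases.

```python
class TinyDB:
    """Toy key-value store with a system log and deferred updates (illustrative only)."""

    def __init__(self):
        self.data = {}      # stands in for the database on disk
        self.log = []       # system log records: (txn_id, key, old_value, new_value)
        self.pending = {}   # deferred updates held in memory, one buffer per transaction

    def write(self, txn_id, key, value):
        # Deferred update: buffer the change instead of touching self.data.
        self.pending.setdefault(txn_id, {})[key] = value

    def commit(self, txn_id):
        # At the commit point, log every change and apply the whole batch.
        for key, value in self.pending.pop(txn_id, {}).items():
            self.log.append((txn_id, key, self.data.get(key), value))
            self.data[key] = value

    def abort(self, txn_id):
        # Nothing reached the database, so there is nothing to undo.
        self.pending.pop(txn_id, None)

    def undo(self, txn_id):
        # Walk the log backwards and restore the old values of one transaction.
        # Simplification: an old value of None is treated as "key did not exist".
        for logged_txn, key, old, _new in reversed(self.log):
            if logged_txn == txn_id:
                if old is None:
                    self.data.pop(key, None)
                else:
                    self.data[key] = old

    def redo(self):
        # Rebuild the database by replaying every logged change in order.
        self.data = {}
        for _txn, key, _old, new in self.log:
            self.data[key] = new


db = TinyDB()
db.write("T1", "stock", 10)
db.commit("T1")              # the change only becomes visible here
db.write("T2", "stock", 7)
db.abort("T2")               # discarded; the database still holds 10
```

Because uncommitted changes never reach the data, aborting a transaction needs no undo step; the log is
only needed to undo or redo changes that were already committed.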
A.3.4 | INTEGRATED DATABASE SYSTEM

An integrated database system, also known as an integrated data management system (IDMS), is a
system that provides a centralised, unified view of data from multiple sources.
It is designed to integrate data from various systems, applications, and databases, and to provide a
single point of access for users. Here's an outline of how integrated database systems function:
● Data Collection: The first step in an integrated database system is data collection. Data is
collected from various sources, such as databases, applications, and file systems. This data is
then consolidated into a single database or data warehouse.
● Data Integration: The next step is data integration. Data from different sources is integrated
into a single, unified format. This involves transforming data from its original format into a
common format that can be easily accessed and analysed.
● Data Cleansing: Data cleansing involves identifying and correcting errors, inconsistencies, and
duplicates in the data. This is an important step in ensuring that the data is accurate, complete,
and consistent.
● Data Storage: Once the data has been collected, integrated, and cleansed, it is stored in a
centralised database or data warehouse. This database is optimised for querying and analysis,
and may use specialised storage technologies such as columnar storage or in-memory databases.
● Data Access: The final step is data access. Users can access the data in the integrated database
system using a variety of tools, such as SQL queries, data visualisation tools, or reporting tools.
The system may also provide APIs or web services for programmatic access.
An integrated database system is designed to provide a centralised, unified view of data from multiple
sources. It involves collecting data, integrating it into a common format, cleansing it, storing it in a
centralised database or data warehouse, and providing access to users through a variety of tools and
interfaces. The goal is to provide a single point of access for users and to ensure that the data is
accurate, complete, and consistent.
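The five steps can be sketched in a few lines of Python. This is only an illustration under stated
assumptions: the two source lists (crm_rows and shop_rows), the customers table and the rule "two
records with the same email are duplicates" are all invented for the example.

```python
import sqlite3

# Two hypothetical sources that describe customers in different formats.
crm_rows = [{"Name": "Ada Lovelace ", "Email": "ADA@EXAMPLE.COM"},
            {"Name": "Alan Turing", "Email": "alan@example.com"}]
shop_rows = [("ada@example.com", "Ada Lovelace"),
             ("grace@example.com", "Grace Hopper")]

def collect():
    # Data collection and integration: map every source to (name, email).
    unified = [(r["Name"], r["Email"]) for r in crm_rows]
    unified += [(name, email) for email, name in shop_rows]
    return unified

def cleanse(rows):
    # Data cleansing: trim whitespace, normalise case, drop duplicates by email.
    seen, clean = set(), []
    for name, email in rows:
        name, email = name.strip(), email.strip().lower()
        if email not in seen:
            seen.add(email)
            clean.append((name, email))
    return clean

def store(rows):
    # Data storage: load the cleansed rows into one central table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT, email TEXT PRIMARY KEY)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    conn.commit()
    return conn

conn = store(cleanse(collect()))
# Data access: users query the single integrated table.
names = [row[0] for row in conn.execute("SELECT name FROM customers ORDER BY name")]
```

The duplicate record for Ada Lovelace appears in both sources but is stored only once, which is the
point of the cleansing step.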
A.3.5 | EXAMPLE USE OF DATABASES

Databases are widely used in many different areas to store and manage data. Here's an outline of how
databases are used in specific areas:
● Stock Control: Databases are used in stock control systems to manage inventory levels and
track sales. The database contains information about each product, including its SKU,
description, price, and quantity on hand. When a sale is made, the database is updated to reflect
the change in inventory levels. Reports can be generated from the database to help with
forecasting, ordering, and reordering.
● Police Records: Databases are used in police records systems to store and manage information
about crimes, suspects, and victims. The database contains details such as the location and date
of the crime, the type of crime, and any evidence or witness statements. This information can be
used to identify patterns, track suspects, and solve crimes.
● Health Records: Databases are used in healthcare systems to store and manage patient health
records. The database contains information such as the patient's name, age, medical history,
diagnoses, medications, and test results. This information can be used by healthcare providers to
make informed decisions about treatment, monitor patient progress, and provide better care.
● Employee Data: Databases are used in human resources systems to store and manage employee
data. The database contains information such as the employee's name, address, contact
information, job title, salary, and benefits. This information can be used to manage payroll,
track performance, and provide benefits to employees.
Databases are used in many different areas to store and manage data, including stock control, police
records, health records, and employee data. The information stored in these databases can be used for a
variety of purposes, such as tracking inventory, solving crimes, providing healthcare, and managing
human resources.
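As a rough illustration of the stock control case, the sketch below uses Python's built-in sqlite3
module. The products table follows the fields named above (SKU, description, price, quantity on hand);
the reorder_level column, the sample rows and the function names are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE products (
    sku TEXT PRIMARY KEY, description TEXT, price REAL,
    quantity_on_hand INTEGER, reorder_level INTEGER)""")
conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?, ?)", [
    ("PEN-01", "Blue pen", 1.20, 50, 20),
    ("NBK-02", "Notebook", 3.50, 12, 10),
])

def record_sale(conn, sku, quantity):
    """Reduce stock for one sale; refuse the sale if stock is insufficient."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE products SET quantity_on_hand = quantity_on_hand - ? "
            "WHERE sku = ? AND quantity_on_hand >= ?", (quantity, sku, quantity))
        if cur.rowcount != 1:
            raise ValueError(f"cannot sell {quantity} of {sku}")

def reorder_report(conn):
    """List SKUs whose stock has fallen to or below the reorder level."""
    rows = conn.execute(
        "SELECT sku FROM products "
        "WHERE quantity_on_hand <= reorder_level ORDER BY sku")
    return [sku for (sku,) in rows]

record_sale(conn, "NBK-02", 5)   # notebooks drop from 12 to 7
to_reorder = reorder_report(conn)
```

Putting the stock check inside the UPDATE statement means a sale can never push the quantity below
zero, even if two sales are attempted close together.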
A.3.6 | DATA PROTECTION

Ensuring the privacy of personal data is essential in any organisation that collects, stores, or processes
personal information.
Organisations must adhere to data protection legislation, such as the Data Protection Act and the
Computer Misuse Act in the UK. Although laws vary between countries, most countries take the privacy
of personal data seriously.
Here are some methods to help keep data private from both a computer and human point of view:
TECHNOLOGY METHODS
● Data Encryption: Data encryption is a process of converting data into a coded form to protect
its confidentiality. Encryption can be used to protect data in transit or data at rest.
● Access Controls: Access controls are security measures that restrict access to data based on the
user's identity, role, or permissions. This can include password protection, multi-factor
authentication, and role-based access control.
● Secure Data Storage: Data should be stored in a secure location, such as a server room with
access control and surveillance cameras. In addition, data backups should be stored offsite in a
secure location.
● Regular Security Audits: Regular security audits should be conducted to ensure that data is
protected from unauthorised access, and that all security controls are functioning as intended.
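Two of the access control ideas above, role-based access control and password protection, can be
sketched with Python's standard library. The roles, permissions and function names are hypothetical.
Note that hashing a password is not the same as encrypting data: it only ensures the stored value
cannot easily be turned back into the password.

```python
import hashlib
import hmac
import os

# Hypothetical roles and the actions each may perform on patient records.
ROLE_PERMISSIONS = {
    "doctor": {"read", "update"},
    "receptionist": {"read"},
    "guest": set(),
}

def is_allowed(role, action):
    """Role-based access control: permit only actions granted to the role."""
    return action in ROLE_PERMISSIONS.get(role, set())

def hash_password(password, salt=None):
    """Return (salt, hash) so the password itself is never stored."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def check_password(password, salt, expected):
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)

salt, stored = hash_password("correct horse")
```

An unknown role gets an empty permission set, so access is denied by default rather than granted by
default.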
HUMAN METHODS
● Employee Training: Employees should be trained on the importance of data privacy, as well as
the policies and procedures for protecting personal data. Training should include how to handle
personal data securely and how to detect and report any privacy breaches.
● Access Controls: Access controls should also be enforced for human users, including
restricting access to personal data on a need-to-know basis, and providing training on how to
handle sensitive data.
● Background Checks: Employees who have access to personal data should undergo background
checks to ensure that they have a trustworthy background.
● Privacy Policies and Notices: Privacy policies and notices should be created and made
available to all employees, customers, and partners. These should clearly state the organisation's
privacy practices and procedures.
Ensuring privacy of personal data from both a computer and human point of view requires a
combination of technical and organisational measures: data encryption, access controls for systems
and for people, secure data storage, regular security audits, employee training, background checks,
and privacy policies and notices. By implementing these measures, organisations can help to protect
the confidentiality, integrity, and availability of personal data.
A.3.7 | OPEN TO INTERROGATION

Third parties such as the police or medical services may need to interrogate (query) database systems
for a variety of reasons, including:
● Criminal Investigations: Law enforcement agencies may need to interrogate database systems
to gather evidence in criminal investigations. This may involve accessing records of suspects,
victims, or witnesses.
● Medical Emergencies: Medical services may need to interrogate database systems to access
medical records in cases of emergency. This can help medical professionals make informed
decisions about treatment and care.
● Compliance and Regulations: Some industries, such as finance and healthcare, are subject to
strict regulations and compliance requirements. Third parties may need to interrogate database
systems to ensure that organisations are complying with these regulations.
It is important to balance the need for access with privacy and security concerns, and to ensure that
access to personal data is limited to those who have a legitimate need to know.
A.3.8 | DATA MATCHING AND DATA MINING

Data matching and data mining are two distinct techniques used in database management and analysis.
DATA MATCHING
○ Data matching is a process of comparing two or more datasets to identify matches or
duplicates.
○ It involves searching for records in one dataset that match records in another dataset, based
on certain criteria.
○ Data matching is typically used for data integration, fraud detection, and identity
verification.
○ For example, a bank may use data matching to identify duplicate records in its customer
database or to match customer records with government records to verify customer
identities.
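The bank example can be sketched in Python as follows. The record layout, the sample data and the
matching criteria (same name, ignoring case and extra spaces, plus the same date of birth) are
assumptions made for illustration; real systems often use fuzzier criteria.

```python
def normalise(name):
    # Assumed criterion: compare names case-insensitively, ignoring extra spaces.
    return " ".join(name.lower().split())

def match_records(bank, government):
    """Return (bank_id, government_id) pairs that agree on name and date of birth."""
    index = {(normalise(g["name"]), g["dob"]): g["id"] for g in government}
    matches = []
    for b in bank:
        key = (normalise(b["name"]), b["dob"])
        if key in index:
            matches.append((b["id"], index[key]))
    return matches

bank = [
    {"id": "B1", "name": "Maria  Silva", "dob": "1990-04-12"},
    {"id": "B2", "name": "John Smith", "dob": "1985-01-30"},
]
government = [
    {"id": "G7", "name": "maria silva", "dob": "1990-04-12"},
    {"id": "G9", "name": "John Smith", "dob": "1979-06-02"},
]
verified = match_records(bank, government)
```

Here only Maria Silva is matched; the two John Smith records share a name but not a date of birth, so
they are treated as different people.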
DATA MINING
○ Data mining is a process of analysing large datasets to discover patterns and relationships.
○ It involves using statistical and machine learning algorithms to analyse data and uncover
insights.
○ Data mining is typically used for business intelligence, marketing, and scientific research.
○ For example, a retailer may use data mining to analyse customer purchase patterns and identify
which products are frequently bought together.
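A very simple form of the retailer example is to count how often each pair of products appears in the
same basket. The sketch below does this in Python; the baskets, the function name and the min_count
threshold are made up for the example, and real data mining tools use far more scalable algorithms.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(baskets, min_count=2):
    """Count product pairs bought together and keep those seen at least min_count times."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) removes repeats and gives each pair a fixed order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}

baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "tea"],
    ["bread", "milk"],
]
pairs = frequent_pairs(baskets)
```

With these four baskets, bread and butter and bread and milk are each bought together twice, so both
pairs pass the threshold, while butter and milk and milk and tea appear together only once.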