ICS 431-Ch11-Mass Storage Structure
Weeks 12-13
L5: Remote secondary storage, i.e. storage on network servers (Distributed File Systems, Web Servers).
Dr. Tarek Helmy, KFUPM-ICS
Solid-State Disks
What is inside a Hard Disk Drive?
[Figure: HDD internals: platter surfaces with tracks and sectors, spindle, R/W heads on an actuator arm, and a SCSI connector.]
Hard Disk Structure
Tracks: Multiple concentric circles
containing data.
Sectors: Subsections of a track.
Blocks: Subsection of a sector created
during formatting (512-4096 B).
Cylinder: All tracks of the same diameter in the disk pack.
[Figure: Disk Data Layout. A: Track, B: Sector, C: Track Sector, D: Cluster]
Moveable heads: One r/w head per
surface.
Seek time: Time to get r/w head to the
desired track.
Head crash: Results from disk head
making contact with the disk surface.
Rotation/latency delay: Waiting time for
the desired data to spin under R/W
head.
Block transfer time: Time to read or
write a block of data.
Disk Performance Parameters
To read or write, the head must be positioned at the desired track and
at the beginning of the desired sector.
Seek time: The time it takes to position the head at the desired track.
Rotational delay/latency: The time it takes for the disk to rotate so that
the required sector is under the head.
– Drives rotate at 60 to 200 times per second.
• Transfer rate: The rate at which data flows between the drive and main
memory (MM). Data transfer occurs as the sector moves under the head.
Access time = Seek time + Latency time (about 15-60 ms) + Data
transfer time (about 1-2 ms).
Disk bandwidth is the total number of bytes transferred, divided by the
total time (between the first request for service and the completion of
the last transfer).
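As a rough sanity check of the access-time formula, it can be evaluated under assumed drive parameters (the 9 ms seek, 7200 RPM, and 150 MB/s figures below are illustrative, not taken from the slides):

```python
def avg_access_time_ms(seek_ms, rpm, transfer_mb_s, block_kb):
    # Average rotational latency is half of one full rotation.
    latency_ms = 0.5 * (60_000.0 / rpm)
    # Time to transfer one block at the sustained transfer rate.
    transfer_ms = (block_kb / 1024.0) / transfer_mb_s * 1000.0
    return seek_ms + latency_ms + transfer_ms

# Assumed example: 9 ms average seek, 7200 RPM, 150 MB/s, 4 KB block.
t = avg_access_time_ms(9.0, 7200, 150.0, 4)  # about 13.2 ms
```

Note that seek plus rotational latency dominates: the block transfer itself costs only a few hundredths of a millisecond.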
Timing of a Disk I/O Transfer
The goal is a fast access time and a large disk bandwidth.
Since seek time is roughly proportional to seek distance, we minimize seek time by minimizing head movement.
The idea is to improve both access time and bandwidth by scheduling pending disk I/O requests in a good order.
Whenever a process needs an I/O to or from the disk, it issues a system call to
the OS, where the request specifies:
Whether this operation is input or output.
What is the disk address to read from?
What is the memory address to write into?
What is the number of bytes to be transferred?
If the desired disk drive and the controller are available, the request can be
serviced immediately, otherwise the request will be queued for that drive.
Disk’s Head Scheduling Algorithms
• Disk’s Head Scheduler: Examines the current contents of the disk’s queue and
decides which request to be served next.
• The head scheduler must consider 3 important factors:
1. Overall performance of the disk
2. Fairness in treating processes’ disk requests
3. Cost of executing scheduling algorithm
Several algorithms exist to schedule the disk I/O requests.
First-In First-Out (FIFO),
Shortest Seek Time First (SSTF)
SCAN: The Elevator Algorithm,
C-SCAN, and C-LOOK
• These algorithms apply to electromechanical disks (HDDs), which have spinning platters and movable heads.
• They are not used with solid-state (electronic) disks, which have no moving parts.
• These algorithms will be evaluated based on the head movement distance that will
affect the seek time. We will illustrate that with the following example:
• Example: a disk with cylinders 0-199, a queue of I/O requests for blocks on cylinders 98, 183, 37, 122, 14, 124, 65, 67, and the head initially at cylinder 53.
First-In First-Out (FIFO)
Service order: 98, 183, 37, 122, 14, 124, 65, 67.
Traveled distance = 45 + 85 + 146 + 85 + 108 + 110 + 59 + 2 = 640 cylinders.
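The FIFO calculation above can be sketched in a few lines (head at cylinder 53, requests served strictly in arrival order):

```python
def fifo_distance(head, requests):
    # Serve requests strictly in arrival order, summing head movement.
    total = 0
    for r in requests:
        total += abs(r - head)
        head = r
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]
fifo_distance(53, queue)  # 640 cylinders
```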
SSTF – Shortest Seek Time First
Service order: 65, 67, 37, 14, 98, 122, 124, 183.
Traveled distance = 12 + 2 + 30 + 23 + 84 + 24 + 2 + 59 = 236 cylinders.
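The SSTF result can be reproduced with a greedy sketch that always serves the pending request nearest to the head:

```python
def sstf_distance(head, requests):
    # Greedily pick the pending request closest to the current head position.
    pending = list(requests)
    total = 0
    while pending:
        nearest = min(pending, key=lambda r: abs(r - head))
        total += abs(nearest - head)
        head = nearest
        pending.remove(nearest)
    return total

sstf_distance(53, [98, 183, 37, 122, 14, 124, 65, 67])  # 236 cylinders
```

Note the greedy choice is what makes SSTF prone to starvation: a far request can wait indefinitely while closer ones keep arriving.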
SCAN: The Elevator Algorithm
• The disk head moves from its current position toward the center of the disk, serving the closest request in that direction. When no requests remain in that direction, it reverses and moves toward the edge, doing the same.
Service order: 37, 14, (edge at cylinder 0), 65, 67, 98, 122, 124, 183.
Traveled distance = 16 + 23 + 14 + 65 + 2 + 31 + 24 + 2 + 59 = 236 cylinders.
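The SCAN sweep above (toward cylinder 0 first, then reversing at the edge) can be sketched as:

```python
def scan_distance(head, requests):
    # Head sweeps toward cylinder 0 first, touches the edge,
    # then reverses and serves the remaining requests.
    down = sorted((r for r in requests if r <= head), reverse=True)
    up = sorted(r for r in requests if r > head)
    total, pos = 0, head
    for r in down:
        total += pos - r
        pos = r
    if up:
        total += pos  # continue all the way to the edge at cylinder 0
        pos = 0
        for r in up:
            total += r - pos
            pos = r
    return total

scan_distance(53, [98, 183, 37, 122, 14, 124, 65, 67])  # 236 cylinders
```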
Some Modification to the SCAN (LOOK)
• Reverse the direction immediately, without going all the way to the edge or to
the center of the disk.
Service order: 37, 14, 65, 67, 98, 122, 124, 183.
Traveled distance = 16 + 23 + 51 + 2 + 31 + 24 + 2 + 59 = 208 cylinders.
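Reversing at the last pending request instead of the edge saves the extra travel, as this sketch of the modified (LOOK-style) sweep shows:

```python
def look_distance(head, requests):
    # Like SCAN, but the head reverses at the last pending request
    # instead of traveling all the way to the disk edge.
    down = sorted((r for r in requests if r <= head), reverse=True)
    up = sorted(r for r in requests if r > head)
    total, pos = 0, head
    for r in down:
        total += pos - r
        pos = r
    for r in up:
        total += abs(r - pos)
        pos = r
    return total

look_distance(53, [98, 183, 37, 122, 14, 124, 65, 67])  # 208 cylinders
```

Compared with full SCAN on the same queue, skipping the idle trip from cylinder 14 down to 0 and back saves 28 cylinders of movement.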
SCAN: The Elevator Algorithm
This algorithm usually gives fair service to all requests, but in the worst case, it
can still lead to starvation.
While it is satisfying requests on one cylinder, other requests for the same
cylinder could arrive.
If enough requests for the same cylinder keep coming, the heads would stay at
that cylinder forever, starving all other requests.
This problem is easily avoided by limiting how long the heads will stay at any
cylinder. One simple scheme is to serve only the requests for the cylinder that
are already there when the head gets there.
New requests for that cylinder that arrive while existing requests are being
served will have to wait for the next pass.
Selecting a Disk-Scheduling Algorithm
SSTF is common and has a natural appeal if the process needs high data transfer rate.
SCAN and C-SCAN perform better for systems that place a heavy load on the disk.
If the queue has just one or two requests, all the algorithms behave like FCFS.
The choice of the best algorithm can also be influenced by the file-allocation method (contiguous, linked, or indexed).
In general, either SSTF or C-LOOK is a reasonable choice as the default for normal processes.
Recommended actions:
The disk-scheduling algorithm should be written as a separate module of the OS,
allowing it to be replaced with a different algorithm if necessary or to be updated.
The location of directories and index blocks is also important: opening a file requires searching the directory on disk, and reading the file then requires accessing the disk again. Caching directories and index blocks in main memory helps reduce head movement, especially for read operations.
Disk Formatting
• Two formatting processes are required before we can write data to a HDD:
1. Low-level (Physical) formatting: usually performed at the factory
Marking out cylinders and tracks for a blank hard disk, and then
dividing tracks into multiple sectors.
Sequentially numbers the tracks and sectors on the disk,
Identifies each track and sector,
Disk is physically prepared to hold data.
Disk Partitioning and MBR
Disk Partitioning and the Seek Time
• If seek distances can be minimized, overall performance improves and the disk-head scheduling algorithm becomes less important.
• Disks are typically partitioned to minimize the largest possible seek time.
– A partition is a collection of cylinders
– Each partition is logically a separate disk
• Swap-space (virtual memory): The disk space that is used as an extension to the main
memory.
• Swap space can be carved out of the normal file system or, more commonly, it can be placed in a separate disk partition.
• Knowing that the transfer rate depends on the position of reading or writing, do you think
virtual memory should be selected from outer or inner partitions?
[Figure: a disk divided into Partition A and Partition B.]
Mass Storage Performance
Reliability Improved by Using ECC (Parity Bit)
Disk Technology Trends: Price per Megabyte of DRAM
RAID Technology
• Because disks have become small and cheap, it is easy to put many
disks (tens to hundreds) in one box to increase storage capacity,
performance, and availability.
With parallel access to multiple disks, the OS can improve the transfer
rate.
• The RAID box with a RAID controller looks just like a single large
expensive disk (SLED) to the computer.
RAID Levels
Improvement of Disk Performance
Data Striping:
– Bit-level striping: Stripe the bits of each byte across multiple disks.
• The number of disks must be a multiple of 8, or a divisor of 8.
– Byte-level striping: Stripe the bytes across multiple disks.
– Block-level striping: Blocks of a file are striped across multiple disks;
with n disks, block i goes to disk (i mod n) + 1.
• Every disk participates in every access:
– The number of I/Os per second is the same as for a single disk.
– The amount of data read/written per second is improved.
• Provide high data-transfer rates.
– Throughput is increased through a larger effective block size and
through the ability to perform parallel I/O.
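The block-to-disk mapping above is a one-liner; this sketch just makes the (i mod n) + 1 rule concrete:

```python
def disk_for_block(i, n):
    # Block-level striping: block i of a file lands on disk (i mod n) + 1,
    # using the 1-based disk numbering from the slide.
    return (i % n) + 1

[disk_for_block(i, 4) for i in range(8)]  # [1, 2, 3, 4, 1, 2, 3, 4]
```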
Disk Striping (RAID-0)
[Figure: OS disk blocks 0-15 striped block-by-block across four disks.]
RAID Level 2
Bit-level striping with Hamming ECC for error correction & recovery.
All member disks participate in the execution of every I/O request
(gives a high transfer rate but not a high I/O request rate).
Spindles and heads are all synchronized to the same position.
Requires a smaller number of disks compared to RAID level 1.
It uses parity checks: a parity bit is associated with each byte, using
even or odd parity.
In RAID level 2, 3 parity bits are used to reconstruct a damaged byte.
Raid Level 3
• Byte-level striping with dedicated parity stored on a separate parity disk
drive.
• Because RAID 3 combines striping with parity stored on a dedicated disk,
the disks must spin in sync, so sequential read/write (R/W) operations
achieve good performance.
• A read accesses all the data disks.
• A write accesses all the data disks plus the parity disk.
• On a disk failure, read data from the remaining disks plus parity disk to
compute the missing data.
• High throughput for transferring large amounts of data
[Figure: data disks plus a dedicated parity disk.]
RAID Level 3: Parity Disk
[Figure: a logical record striped byte-by-byte across four physical disks, with the parity disk P storing the bit-wise parity of the striped records.]
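The parity-based recovery described above can be sketched with byte-wise XOR (the stripe contents below are made up for illustration):

```python
from functools import reduce

def parity(blocks):
    # Byte-wise XOR parity across equally sized data blocks.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

data = [b"\x9d\x10", b"\x33\x42", b"\x7f\x00"]  # made-up stripe contents
p = parity(data)
# If data[1] is lost, XOR the surviving blocks with the parity block:
recovered = parity([data[0], data[2], p])  # equals data[1]
```

Because XOR is its own inverse, the same function both computes the parity and reconstructs any single missing block.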
Raid Level 4
• Block-interleaved parity
– RAID 4 is very similar to RAID 3. The main difference is how data is
distributed: data is divided into blocks and written across the disks.
– One disk is a parity disk, keeps parity blocks
– Parity block at position X is the parity for all blocks whose position is X on
any of the data disks
– A read accesses only the data disk that holds the requested data.
– A write must update the data block and its parity block.
– Can recover from an error on only one disk.
– Note that with N disks we have N-1 data disks and only one parity disk, but
can still recover when one disk fails
– But write performance is worse than with a single disk (every write
must read and then write the parity disk).
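That read-modify-write parity update can be sketched as follows (the byte values are arbitrary):

```python
def update_parity(old_data, new_data, old_parity):
    # Small-write update: new parity = old parity XOR old data XOR new data,
    # so only the target data disk and the parity disk are touched.
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))

new_parity = update_parity(b"\x10\x20", b"\x5a\x21", b"\xff\x00")
```

This is why a small write costs two reads (old data, old parity) plus two writes (new data, new parity), rather than touching every disk as in RAID 3.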
[Figure: data disks plus a dedicated parity disk.]
Raid Level 5
Raid Level 6
• Level 5 with an extra check block per protection group
• Two different (P and Q) check blocks
– Each protection group has
• N-2 data blocks
• One parity block
• Another check block (not the same as parity)
• Can recover when two disks are lost
– Think of P as the sum and Q as the product of D blocks
– If two blocks are missing, solve equations to get both back
• More space overhead (only N-2 of N are data)
• More write overhead (must update both P and Q)
– P and Q still distributed like in RAID 5
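Real RAID 6 computes Q with Galois-field arithmetic; as a toy illustration of the "solve two equations" idea, take P as the plain sum and Q as a weighted sum over small integers (all values here are hypothetical):

```python
def raid6_toy_recover(blocks, j, k, P, Q):
    # blocks[j] and blocks[k] are lost (None); P = sum(D_i) and
    # Q = sum((i + 1) * D_i) give two equations in the two unknowns.
    s = P - sum(v for i, v in enumerate(blocks) if v is not None)
    w = Q - sum((i + 1) * v for i, v in enumerate(blocks) if v is not None)
    dk = (w - (j + 1) * s) // (k - j)   # eliminate D_j from the equations
    return s - dk, dk                   # (D_j, D_k)

# Data blocks were [5, 3, 7, 9]; blocks 1 and 3 are lost.
raid6_toy_recover([5, None, 7, None], j=1, k=3, P=24, Q=68)  # (3, 9)
```

Because the two check equations are independent, any two missing blocks can be recovered; a single XOR parity could not distinguish them.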
[Figure: data disks with P and Q check blocks distributed across them.]
Selecting a RAID Level
• If a disk fails, the time to rebuild its data can be significant and will
vary with the RAID level used.
• RAID level 0 is used in high-performance applications where data
loss is not critical.
• Rebuilding is easiest for RAID level 1. Simply copy the data from
another disk.
• RAID level 1 is popular for applications that require high
reliability with fast recovery.
• The combination of RAID levels 0 and 1 (RAID 0+1) is used for
applications where both performance and reliability are important,
e.g. banking systems' databases.
• Due to RAID 1’s high space overhead, RAID level 5 is often preferred
for storing large volumes of data.
• RAID level 6 is not supported currently by many implementations, but
should offer better reliability than level 5.
The End!!
Thank you
Any Questions?