CH 1 Distributed Systems
Chapter One
Introduction
Mulugeta A.
What is a distributed system ?
A distributed system is:
A collection of autonomous/independent computing
elements that appears to its users as a single coherent
system.
This definition refers to two characteristic features of distributed systems:
Computing elements of distributed system are independent
Users or applications perceive a single system
Collection of autonomous nodes
Autonomous computing elements are also called nodes
Can be either hardware devices or software processes
Behave independently of each other
Each node has its own notion of time
There is no global clock.
Leads to fundamental synchronization and coordination problems.
However, these nodes collaborate to achieve a common
goal
Realized by exchanging messages with each other
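To see why the absence of a global clock causes coordination problems, consider the following minimal sketch. The `Node` class and the skew values are hypothetical and chosen only for illustration: two nodes stamp events with their own local clocks, and sorting by those timestamps reverses the real order of a send and its matching receive.

```python
from dataclasses import dataclass

@dataclass
class Node:
    """A node whose local clock is offset from the (unobservable) true time."""
    name: str
    skew: float  # seconds by which the local clock runs ahead of true time

    def stamp(self, true_time: float, event: str) -> tuple:
        # A node can only read its own clock, never the true time.
        return (true_time + self.skew, f"{self.name}:{event}")

a = Node("A", skew=0.50)   # A's clock runs 0.5 s fast
b = Node("B", skew=-0.25)  # B's clock runs 0.25 s slow

# In true time, A sends at t=10.0 and B receives at t=10.1.
send = a.stamp(10.0, "send")
recv = b.stamp(10.1, "receive")

# Sorting by local timestamps puts the receive before the send.
ordered = [label for _, label in sorted([send, recv])]
print(ordered)
```

This is the kind of anomaly that clock synchronization and coordination protocols are meant to prevent.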
Coherent system
A single coherent system – users believe they are dealing
with a single system
The difference between components as well as the
communication between them are hidden from users
Users can interact in a uniform and consistent way regardless
of where and when interaction takes place
Examples
An end user cannot tell where a computation is taking place
Where data is exactly stored should be irrelevant to an application
Whether or not data has been replicated is completely hidden
Note: Autonomous elements have to collaborate to achieve this goal
Middleware
The middleware layer extends over multiple machines, and offers each
application the same interface.
The goal is to hide the heterogeneity of the underlying operating systems and hardware
What does it contain?
Commonly used components and functions that need not be implemented by
applications separately.
Network of Workstations
Goals of distributed systems
Should you build distributed systems just because you can?
Should you build distributed systems for problems that can be solved
by a single machine?
NO
There are 4 goals that should be met to make building a
distributed system worth the effort.
Resource Accessibility - Easy to access and share resources
Distribution Transparency - Hide the fact that resources are distributed
across the network
Openness - The system should offer services according to standard
rules that describe their syntax and semantics
Extensible: easy to add / replace components
Scalability - Size scalable, geographically scalable, administratively
scalable
Resource sharing/accessibility
An important goal of a distributed system is to make it easy for
users (and applications) to access and share remote resources.
Resources can be virtually anything.
Typical examples: storage facilities, data, files, services, and networks
Benefits
Economic
It is cheaper to have a single high-end reliable storage facility be shared than
having to buy and maintain storage for each user separately.
Encourage collaboration and exchange of information
Allows geographically dispersed people to work together by means of
groupware such as collaborative editing and teleconferencing tools
BitTorrent – allows users to share files across the Internet
The main challenge of resource sharing is security
E.g. email spam, DoS attacks
Distribution Transparency
A DS hides the fact that its processes and resources are
physically distributed across multiple computers
A DS that is able to present itself to users and applications
as if it were only a single computer system is said to be
transparent
That is, the distribution is invisible to end users and applications.
The concept of transparency can be applied to several
aspects of a distributed system.
The most important ones are shown on the next slide.
Types of Transparency in a Distributed System
Transparency   Description
Access         Hide differences in data representation and how an object is accessed
Relocation     Hide that an object may be moved to another location while in use
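As an illustration of access transparency, the sketch below hides byte-order differences between machines by always marshalling integers in network (big-endian) byte order. The helper names `marshal_u32` and `unmarshal_u32` are hypothetical; `struct` is Python's standard library module.

```python
import struct

def marshal_u32(value: int) -> bytes:
    """Encode an unsigned 32-bit integer in network (big-endian) byte order."""
    return struct.pack("!I", value)

def unmarshal_u32(data: bytes) -> int:
    """Decode an unsigned 32-bit integer that was sent in network byte order."""
    return struct.unpack("!I", data)[0]

# The same value is laid out differently on little- and big-endian machines...
little = struct.pack("<I", 1)
big = struct.pack(">I", 1)

# ...but both sides agree once they use the common wire format.
wire = marshal_u32(1)
print(little != big, unmarshal_u32(wire))
```

Applications call only the two helpers and never see which representation the remote machine uses internally.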
Openness
An open distributed system is essentially a system that offers
components that can easily be used by, or integrated into other systems.
It often itself consists of components that originate from elsewhere
How?
Components adhere to standard rules that describe the syntax and semantics
of what services those components have to offer
Services are defined through interfaces, using an interface definition language (IDL)
An IDL captures only the syntax of services: function names, parameters, return
types, etc.
Semantics (what those services do) are specified in natural language
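Python has no IDL of its own, so the following sketch uses `typing.Protocol` as a stand-in to show the split between syntax and semantics. The `FileService` interface and the `InMemoryFileService` class are hypothetical examples, not part of any standard.

```python
from typing import Protocol

class FileService(Protocol):
    """Interface: fixes only names, parameters, and return types (syntax)."""

    def read(self, path: str) -> bytes:
        """Semantics, in natural language: return the file's full contents."""
        ...

    def write(self, path: str, data: bytes) -> int:
        """Semantics: replace the file's contents; return the bytes written."""
        ...

class InMemoryFileService:
    """One possible implementation; any class with the same methods also fits."""

    def __init__(self) -> None:
        self._files = {}

    def read(self, path: str) -> bytes:
        return self._files[path]

    def write(self, path: str, data: bytes) -> int:
        self._files[path] = data
        return len(data)

def copy(service: FileService, src: str, dst: str) -> int:
    # Client code depends only on the interface, not on the implementation.
    return service.write(dst, service.read(src))
```

Because `copy` is written against the interface alone, a different implementation of `FileService` can be swapped in without changing it, which is the basis for interoperability and portability.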
Openness
By openness we want to achieve
Interoperability
Implementations from different manufacturers can work together by
merely relying on the standard rules
Portability
Applications from one distributed system can be executed on another
distributed system that implements the same interface
Extensibility
Easy to add or replace components in the system
Flexibility
Easy to modify/customize the system/component to a specific need
Flexibility is achieved by separating policy from mechanism
Policies versus mechanisms
Policies
Which operations do we allow downloaded code to perform?
Which QoS requirements do we adjust in the face of varying bandwidth?
What level of secrecy do we require for communication?
Mechanisms
Support different levels of trust for mobile code (like Applets)
Provide adjustable QoS parameters per data stream
Offer different encryption algorithms
Observation
The stricter the separation between policy and mechanism, the more configuration
parameters are needed, which leads to complex management.
Hard-coding policies often simplifies management and reduces complexity at the price
of less flexibility.
Solution – Find the right balance
In practice, it helps to provide reasonable defaults for parameters
There is no obvious solution
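The separation can be sketched with a small cache, a hypothetical example that is not one of the cases listed above. The `Cache` class is the mechanism (it stores entries and evicts one when full), the eviction function is the policy, and a reasonable default keeps configuration simple.

```python
from typing import Callable, Dict, Hashable

# A policy decides WHICH key to evict; the cache below is only the mechanism.
EvictionPolicy = Callable[[Dict[Hashable, int]], Hashable]

def evict_least_recently_used(last_used: Dict[Hashable, int]) -> Hashable:
    """Default policy: evict the key with the oldest access tick."""
    return min(last_used, key=last_used.get)

def evict_most_recently_used(last_used: Dict[Hashable, int]) -> Hashable:
    """Alternative policy: evict the key with the newest access tick."""
    return max(last_used, key=last_used.get)

class Cache:
    """Mechanism: stores entries and evicts one when full, as the policy dictates."""

    def __init__(self, capacity: int,
                 policy: EvictionPolicy = evict_least_recently_used) -> None:
        self.capacity = capacity
        self.policy = policy          # reasonable default, still overridable
        self.data = {}
        self.last_used = {}
        self.tick = 0

    def _touch(self, key) -> None:
        self.tick += 1
        self.last_used[key] = self.tick

    def get(self, key):
        value = self.data[key]        # raises KeyError on a miss
        self._touch(key)
        return value

    def put(self, key, value) -> None:
        if key not in self.data and len(self.data) >= self.capacity:
            victim = self.policy(self.last_used)
            del self.data[victim]
            del self.last_used[victim]
        self.data[key] = value
        self._touch(key)
```

Calling `Cache(2)` uses the default policy, while `Cache(2, policy=evict_most_recently_used)` changes the behavior without touching the mechanism.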
Scalability
Scalability is the ability of a system, network, or process to handle a
growing amount of work in a capable manner (with no significant loss
of performance)
Measured in three dimensions
Size scalable
Can easily add more users or resources to the system
Geographically scalable
Can easily handle users and resources that may lie far apart
Administratively scalable
Can easily be managed even if it spans many independent administrative
organizations
Observation
Most systems account only for size scalability
Often solved by running multiple powerful servers independently in
parallel
Size Scalability problems
Arise when services are implemented by means of a single server or a few
tightly coupled servers in the distributed system.
The server, or group of servers, can simply become a bottleneck when it needs
to process an increasing number of requests.
Root causes for size scalability problems with centralized solutions
The computational capacity, limited by the CPUs
The storage capacity, including the transfer rate between CPUs and disks (I/O
transfer rate)
The network between the user and the centralized service
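To make the bottleneck concrete, a centralized server can be modeled as a single queue. The sketch below assumes an M/M/1 model (Poisson arrivals, exponential service times, one server), which is an assumption of this example and not something stated above. Under that model the utilization is U = lambda / mu and the mean response time is R = S / (1 - U), where S = 1 / mu is the service time. The request rates are hypothetical.

```python
def utilization(arrival_rate: float, service_rate: float) -> float:
    """Fraction of time the server is busy: U = lambda / mu."""
    return arrival_rate / service_rate

def mean_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean response time of an M/M/1 queue: R = S / (1 - U), with S = 1 / mu."""
    u = utilization(arrival_rate, service_rate)
    if u >= 1.0:
        raise ValueError("server is saturated: arrival rate >= service rate")
    return (1.0 / service_rate) / (1.0 - u)

# Hypothetical server that can process 100 requests per second.
for load in (10, 50, 90, 99):
    r = mean_response_time(load, 100.0)
    print(f"{load:>3} req/s -> utilization {load / 100:.2f}, "
          f"response {r * 1000:.1f} ms")
```

At 50% utilization the response time is already twice the service time, and it grows without bound as the utilization approaches 1.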
Size Scalability problems - Examples
Problems with geographical scalability
It is difficult to scale existing distributed systems that were designed for
local-area networks (LANs) to wide-area networks (WANs)
Many distributed systems assume synchronous client-server
interactions:
Client sends request and waits for an answer.
Latency may easily prohibit this scheme.
WAN links are often inherently unreliable:
The effect is that solutions developed for local-area networks cannot always
work on wide-area systems
For example, simply moving streaming video from a LAN to a WAN is bound to fail.
Lack of multipoint communication (like broadcast):
Unlike in a LAN, a simple search broadcast cannot be deployed.
The solution is to develop separate naming and directory services to find hosts and
resources
Problems with administrative scalability
Finally, a difficult, and in many cases open, question is
How to scale a distributed system across multiple, independent
administrative domains?
The major problem that needs to be solved is that of
Conflicting policies with respect to resource usage (and thus payment),
management, and security
Examples
Grid computing: share expensive resources between different domains.
Exception: several peer-to-peer networks
File-sharing systems (based, e.g., on BitTorrent)
Peer-to-peer telephony (Skype)
Note: in such systems end users collaborate and not administrative entities.
Scaling techniques
There are basically three techniques for scaling:
1) Hiding communication latencies
2) Partitioning and distribution
3) Replication
Scaling techniques
1) Hiding communication latencies is applicable in
the case of geographical scalability.
The basic idea is to avoid waiting for responses to
remote service requests as much as possible.
Make use of asynchronous communication
Issue a request and do other useful work until the response
arrives
Problem: not every application fits this model.
Reduce the overall communication between client and server
Move computations to client
See next slide
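Asynchronous communication can be sketched with Python's `asyncio`. The `remote_request` function is a hypothetical stand-in for a real remote call, and its sleep merely simulates network latency. Both requests are issued before either reply is awaited, so their latencies overlap.

```python
import asyncio
import time

async def remote_request(name: str, latency: float) -> str:
    """Stand-in for a remote call; the sleep simulates network latency."""
    await asyncio.sleep(latency)
    return f"reply to {name}"

async def main() -> list:
    # Issue both requests without waiting for either to finish first.
    pending = [
        asyncio.create_task(remote_request("r1", 0.2)),
        asyncio.create_task(remote_request("r2", 0.2)),
    ]
    local = sum(range(1000))      # other useful work while replies are in flight
    replies = await asyncio.gather(*pending)
    return [local, *replies]

start = time.perf_counter()
result = asyncio.run(main())
elapsed = time.perf_counter() - start
# The two latencies overlap, so this takes roughly 0.2 s rather than 0.4 s.
print(result, f"{elapsed:.2f}s")
```

A synchronous client would pay the two latencies one after the other, which is what makes this technique attractive over wide-area links.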
Scaling Techniques (1)- Hide latency
Scaling Techniques (2)- Distribution
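One common way to apply partitioning and distribution is to split a data set across several servers so that each one handles only part of the load. The following is a minimal sketch of hash-based partitioning. The server names and the `shard_for`, `put`, and `get` helpers are hypothetical, and the per-server dictionaries stand in for real machines.

```python
import hashlib

SERVERS = ["server-0", "server-1", "server-2"]  # hypothetical server names

def shard_for(key: str, servers=SERVERS) -> str:
    """Map a key to one server using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

# Each server stores and serves only its own part of the key space.
stores = {name: {} for name in SERVERS}

def put(key: str, value) -> None:
    stores[shard_for(key)][key] = value

def get(key: str):
    return stores[shard_for(key)][key]

for user in ("alice", "bob", "carol", "dave"):
    put(user, f"profile of {user}")
print({name: sorted(part) for name, part in stores.items()})
```

A simple modulo scheme like this remaps most keys when a server is added or removed, so it is only a starting point.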
Pitfalls when Developing Distributed Systems
False assumptions made by first-time developers:
• The network is reliable.
• The network is secure.
• The network is homogeneous.
• The topology does not change.
• Latency is zero.
• Bandwidth is infinite.
• Transport cost is zero.
• There is one administrator.
• Note:
• When one of these assumptions fails, it is difficult to mask unwanted behavior
• Most of these issues will most likely not show up in the development of
non-distributed applications
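As a small illustration of dropping the first assumption ("the network is reliable"), the sketch below retries a failing call with exponential backoff. The `NetworkError` class and the `make_flaky_call` helper are hypothetical and only simulate lost messages; no real network is involved.

```python
import time

class NetworkError(Exception):
    """Raised by the simulated network when a message is lost."""

def make_flaky_call(failures_before_success: int):
    """Return a fake remote call that fails a fixed number of times, then succeeds."""
    state = {"attempts": 0}

    def call() -> str:
        state["attempts"] += 1
        if state["attempts"] <= failures_before_success:
            raise NetworkError("message lost")
        return "ok"

    return call

def call_with_retries(call, max_attempts: int = 4, base_delay: float = 0.01) -> str:
    """Retry a failing call with exponential backoff instead of assuming reliability."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except NetworkError:
            if attempt == max_attempts:
                raise                  # give up and surface the failure
            time.sleep(base_delay * 2 ** (attempt - 1))

print(call_with_retries(make_flaky_call(2)))
```

Blind retries are only safe when the remote operation is idempotent; otherwise a request whose reply was lost may be executed twice.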
Types of distributed systems
Three types of distributed systems
High performance distributed computing systems
Distributed information systems
Distributed systems for pervasive computing
Distributed Computing Systems
Used for high performance computing tasks
Examples of distributed computing systems
Cluster computing systems
Grid computing systems
Cloud computing systems
Cluster Computing Systems
In virtually all cases, cluster computing is used for parallel
programming in which a single (compute intensive) program is run
in parallel on multiple machines.
A typical cluster system consists of a collection of compute nodes
that are controlled and accessed by means of a single master node.
Nodes are connected through a LAN.
Nodes are essentially homogeneous (same OS, near-identical hardware)
The master node typically
Handles the allocation of nodes to a particular parallel program,
Maintains a batch queue of submitted jobs, and
Provides an interface for the users of the system.
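The three tasks of the master node can be sketched as a toy scheduler. The `Master` class, the node names, and the strict first-in, first-out scheduling rule are assumptions made for this example; real cluster schedulers are considerably more elaborate.

```python
from collections import deque

class Master:
    """Toy master node: keeps a FIFO batch queue and hands out free compute nodes."""

    def __init__(self, node_names):
        self.free = list(node_names)   # compute nodes not currently allocated
        self.queue = deque()           # submitted jobs waiting for nodes
        self.running = {}              # job name -> nodes allocated to it

    def submit(self, job: str, nodes_needed: int) -> None:
        """User-facing interface: enqueue a job, then try to start it."""
        self.queue.append((job, nodes_needed))
        self._schedule()

    def finish(self, job: str) -> None:
        """Release a finished job's nodes and start waiting jobs if possible."""
        self.free.extend(self.running.pop(job))
        self._schedule()

    def _schedule(self) -> None:
        # Strict FIFO: start the job at the head of the queue while it fits.
        while self.queue and self.queue[0][1] <= len(self.free):
            job, n = self.queue.popleft()
            self.running[job] = [self.free.pop() for _ in range(n)]

master = Master(["n1", "n2", "n3", "n4"])
master.submit("sim", 3)
master.submit("render", 2)   # must wait: only one node is free
master.finish("sim")         # frees three nodes, so "render" starts
print(master.running)
```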
Example cluster configuration
General configuration of a Linux-based Beowulf cluster
Cont’d
Fabric layer
Provides interfaces to local resources at a specific site.
These interfaces are tailored to allow sharing of resources within a
virtual organization.
Typically, they will provide functions for querying the state and
capabilities of a resource, along with functions for actual resource
management (e.g., locking resources).
Connectivity layer
Consists of communication protocols for supporting grid transactions
that span the usage of multiple resources.
For example, protocols are needed to transfer data between resources,
or to simply access a resource from a remote location.
In addition, the connectivity layer will contain security protocols to
authenticate users and resources.
Cont’d
Resource layer
Responsible for managing a single resource.
It uses the functions provided by the connectivity layer and calls
directly the interfaces made available by the fabric layer.
Collective layer
It deals with handling access to multiple resources and typically
consists of services for resource discovery, allocation and
scheduling of tasks onto multiple resources, data replication, and so
on.
Application layer
Consists of the applications that operate within a virtual organization
and which make use of the grid computing environment.
Note:
Typically the collective, connectivity, and resource layers form
the heart of what could be called a grid middleware layer
Cloud computing
Provides computing services and resources (hardware and
software) over a network/internet
Cloud computing is based upon the concept of utility computing
Customers pay only for what they use (pay-per-use model)
It is characterized by an easily usable and accessible pool
of virtualized resources
It is scalable as users can get more resources if more work needs to
be done
Cloud computing
In practice, clouds are organized into four layers
Hardware – contains resources customers never get to see directly.
Infrastructure - Employs virtualization techniques to provide customers
virtual storage and computing resources
Platform – provides APIs for developing applications, including APIs for storage
Application – actual applications, like the suites of apps shipped with OSes
Cloud computing
Cloud-computing providers offer these layers to their customers
through various interfaces
(command-line tools, programming interfaces, and web interfaces)
Cloud computing providers offer their services according to three
fundamental models
Infrastructure as a service (IaaS)
Covering hardware and infrastructure layer
Basic infrastructure like storage
Platform as a service (PaaS)
Covering the platform layer
Database, web servers
Software as a service (SaaS)
Covering application layer
Software that the clients need, such as a text processor
Advantages of Cloud Computing
Trade capital expense for variable expense
Instead of having to invest heavily in data centers and servers before you know
how you’re going to use them, you can pay only when you consume computing
resources, and pay only for how much you consume.
Benefit from massive economies of scale
By using cloud computing, you can achieve a lower variable cost than you can get
on your own.
Because usage from hundreds of thousands of customers is aggregated in the cloud,
providers such as AWS can achieve higher economies of scale, which translates into lower
pay-as-you-go prices.
Stop guessing capacity
Eliminate guessing on your infrastructure capacity needs.
When you make a capacity decision prior to deploying an application, you often end up
either sitting on expensive idle resources or dealing with limited capacity.
With cloud computing, these problems go away. You can access as much or as little
capacity as you need, and scale up and down as required with only a few minutes’ notice.
Cont’d
Increase speed and agility
In a cloud computing environment, new IT resources are only a click away,
which means that you reduce the time to make those resources available to
your developers from weeks to just minutes.
This results in a dramatic increase in agility for the organization, since the cost and
time it takes to experiment and develop is significantly lower.
Stop spending money running and maintaining data centers
Focus on projects that differentiate your business, not the infrastructure.
Cloud computing lets you focus on your own customers, rather than on the heavy
lifting of racking, stacking, and powering servers.
Go global in minutes
Easily deploy your application in multiple regions around the world
with just a few clicks.
This means you can provide lower latency and a better experience for your
customers at minimal cost.
Issues of cloud computing
Cloud computing is becoming very popular and common
It allows organizations to outsource their IT infrastructure:
hardware and software
Certainly a serious alternative to maintaining huge local
infrastructures
However, it has certain issues to resolve
Provider lock-in,
Security and privacy issues, and
Dependency on the availability of services
Examples (Amazon Web Services Cloud Platform)
Some of the services provided by Amazon (AWS). Read the details of these
services (use the Overview of Amazon Web Services for reference)
Software Development Kits
Analytics
Amazon Athena
Amazon EMR
Amazon CloudSearch
Amazon Elasticsearch Service
Amazon Kinesis
Application Integration
Amazon MQ
Amazon SQS
Amazon SNS
Amazon SWF
Blockchain
Amazon Managed Blockchain
Cont’d
Business Applications
Alexa for Business
Amazon WorkDocs
Amazon WorkMail
Amazon Chime
Compute
Amazon EC2
Amazon EC2 Auto Scaling
Amazon Elastic Container Registry
Amazon Elastic Container Service
Amazon Elastic Container Service for Kubernetes
Amazon Lightsail
AWS Batch
AWS Elastic Beanstalk
AWS Fargate
AWS Lambda
AWS Serverless Application Repository
AWS Outposts
VMware Cloud on AWS
Cont’d
Customer Engagement
Amazon Connect
Amazon SES
Database
Amazon Aurora
Amazon RDS
Amazon RDS on VMware
Amazon DynamoDB
Amazon ElastiCache
Amazon Neptune
Amazon Quantum Ledger Database (QLDB)
Amazon Timestream
Desktop and App Streaming
Developer Tools
Game Tech
Internet of Things (IoT)
Machine Learning
Cont’d
Management and Governance
Media Services
Migration and Transfer
Mobile
Networking and Content Delivery
Robotics
Satellite
Security, Identity, and Compliance
Storage
Amazon S3
Amazon Elastic Block Store
Amazon Elastic File System
Amazon FSx for Lustre
Amazon FSx for Windows File Server
Amazon S3 Glacier
AWS Storage Gateway
Distributed Information Systems
Enterprises might already have multiple networked information
systems
Each one makes its services available to remote clients.
For example, a university might have a registrar system, an HRM system,
etc.
However, integrating applications into an enterprise-wide
information system was painful
Middleware is the solution
Applications can be integrated at different levels
Database level
That results in Distributed Transaction Processing
Application level
Enterprise Application Integration (EAI)
Distributed transaction processing
Operations on a database are usually carried out in the form of
transactions.
A nested transaction is constructed from a number of sub-transactions
The sub-transactions could run in parallel on different machines to gain
performance
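A nested transaction can be sketched as follows. The flight and hotel booking functions are hypothetical examples, threads stand in for the different machines, and the commit rule is deliberately simplified: each sub-transaction returns its tentative writes, and the parent applies them only if every sub-transaction succeeds.

```python
from concurrent.futures import ThreadPoolExecutor

def book_flight(db: dict) -> dict:
    """Sub-transaction: returns its tentative writes without touching db."""
    if db["seats"] < 1:
        raise RuntimeError("no seats left")
    return {"seats": db["seats"] - 1}

def book_hotel(db: dict) -> dict:
    """Sub-transaction: returns its tentative writes without touching db."""
    if db["rooms"] < 1:
        raise RuntimeError("no rooms left")
    return {"rooms": db["rooms"] - 1}

def nested_transaction(db: dict, subs) -> bool:
    """Run sub-transactions in parallel; commit all their writes or none."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(sub, db) for sub in subs]
        try:
            writes = [f.result() for f in futures]
        except RuntimeError:
            return False           # abort: db was never modified
    for w in writes:
        db.update(w)               # commit: apply every tentative write
    return True

db = {"seats": 1, "rooms": 0}
print(nested_transaction(db, [book_flight, book_hotel]), db)
```

Because the hotel booking fails, the whole transaction aborts and the seat count stays unchanged, which illustrates the all-or-nothing property.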
Transaction Processing Monitor (TP monitor)
A TP monitor is middleware
Its main task is to allow an application to access multiple servers/databases
Clients combine requests for (different) servers; send that off; collect
responses, and present a coherent result to the user.
Enterprise Application Integration
For applications decoupled from the databases they were built upon,
facilities were needed to integrate applications independently of their
databases.
The idea is that application components can communicate directly with one
another.
Inter-application communication leads to different communication
models
Remote procedure call (RPC) or remote method invocation (RMI)
A request is issued as a local procedure call, packaged as a message,
processed, and the result is returned as a message to the caller.
The disadvantage is that both caller and callee must be up and running at the
time of communication
Message-oriented middleware (MOM) (publish/subscribe model)
Messages are sent to a logical contact point (published), and forwarded to
subscribed applications
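The publish/subscribe model can be sketched with a tiny in-process broker. The `Broker` class, the topic name, and the two inboxes are hypothetical; a real message-oriented middleware would run as a separate service and typically also store messages so that sender and receiver need not be running at the same time, which this sketch does not attempt.

```python
from collections import defaultdict
from typing import Callable

class Broker:
    """In-process stand-in for message-oriented middleware."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)   # topic -> list of callbacks

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Forward the message to every subscriber; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)

broker = Broker()
registrar_inbox, hrm_inbox = [], []
broker.subscribe("student.enrolled", registrar_inbox.append)
broker.subscribe("student.enrolled", hrm_inbox.append)

# The publisher names only the topic, never the receiving applications.
delivered = broker.publish("student.enrolled", "id=42")
print(delivered, registrar_inbox, hrm_inbox)
```

The publisher and the subscribers know only the topic, not each other, which is what decouples the integrated applications.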
Distributed Pervasive System
The distributed systems discussed so far are largely characterized by their
stability: nodes are fixed and have a more or less permanent and high-quality
connection to a network. But this stability does not hold for
pervasive systems. As their name suggests, pervasive systems are intended to
naturally blend into our environment.
The separation between users and system components is much more blurred.
There is often no single dedicated interface, such as a screen/keyboard
combination.
Uses sensors and actuators
Types of pervasive systems
1. Ubiquitous computing systems
2. Mobile computing systems
3. Sensor networks
Ubiquitous computing systems
Users will be continuously interacting with the system, often
without even being aware that interaction is taking place.
Poslad [2009] describes the core requirements for a ubiquitous
computing system roughly as follows:
1. Distribution: Devices are networked, distributed, and accessible in a
transparent manner
2. Interaction: Interaction between users and devices is highly unobtrusive
3. Context awareness: The system is aware of a user’s context in order to
optimize interaction
4. Autonomy: Devices operate autonomously without human intervention,
and are thus highly self-managed
5. Intelligence: The system as a whole can handle a wide range of dynamic
actions and interactions
Mobile computing systems
A simple definition could be: Mobile Computing is using a
computer (of one kind or another) while on the move
Another definition could be: Mobile Computing is when a work
process is moved from a normal fixed position to a more dynamic
position
A third definition could be: Mobile Computing is when a work
process is carried out somewhere where it was not previously
possible
Mobile Computing is an umbrella term used to describe
technologies that enable people to access services anytime and
anywhere
Example: mobile ad hoc networks (MANETs)
Sensor networks
A sensor network consists of tens to hundreds or thousands of relatively small
nodes, each equipped with one or more sensing devices.
In addition, nodes can often act as actuators
A typical example is the automatic activation of sprinklers when
a fire has been detected.
Many sensor networks use wireless communication, and the
nodes are often battery powered.
Their limited resources, restricted communication capabilities,
and constrained power consumption demand that efficiency is
high on the list of design criteria.
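The sprinkler example can be sketched as a node that reacts locally to its own readings. The `SensorNode` and `Sprinkler` classes and the temperature threshold are hypothetical. Sending a message only on an alarm is one simple way to respect the limited power budget, offered here as an assumption and not as the only possible design.

```python
from dataclasses import dataclass

FIRE_THRESHOLD_C = 60.0   # hypothetical temperature threshold in degrees Celsius

@dataclass
class Sprinkler:
    """Actuator: can be switched on by a node."""
    active: bool = False

    def activate(self) -> None:
        self.active = True

@dataclass
class SensorNode:
    """Node with one temperature sensor and one attached actuator."""
    sprinkler: Sprinkler
    sent_messages: int = 0

    def on_reading(self, temperature_c: float) -> None:
        # Decide locally and transmit only on an alarm, to save battery power.
        if temperature_c >= FIRE_THRESHOLD_C:
            self.sprinkler.activate()
            self.sent_messages += 1

node = SensorNode(Sprinkler())
for reading in (21.5, 22.0, 75.3):
    node.on_reading(reading)
print(node.sprinkler.active, node.sent_messages)
```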
End of Chapter 1