Data Dissemination


Data Dissemination

• Ongoing advances in communications,
including the proliferation of the Internet,
the development of mobile and wireless
networks, and high-bandwidth availability to
homes, have led to the development of a wide
range of new information-centered
applications.
• Many of these applications involve data
dissemination, i.e. delivery of data from a
set of producers to a larger set of
consumers.
• Data dissemination entails distributing
and pushing data generated by a set of
computing systems or broadcasting data
from audio, video, and data services.
• The output data is sent to the mobile
devices. A mobile device can select, tune
and cache the required data items,
which can be used for application
programs.
• Efficient utilization of wireless bandwidth
and battery power are two of the most
important problems facing software
designed for mobile computing.
• Broadcast channels are attractive in
tackling these two problems in wireless
data dissemination.
• Data disseminated through broadcast channels
can be simultaneously accessed by an
arbitrary number of mobile users, thus
increasing the efficiency of bandwidth usage.
Communications Asymmetry
• One key aspect of dissemination-based
applications is their inherent communications
asymmetry.
• That is, the communication capacity or data volume in
the downstream direction (from servers-to-clients) is
much greater than that in the upstream direction (from
clients-to-servers).
• Content delivery is an asymmetric process regardless of
whether it is performed over a symmetric channel such
as the internet or over an asymmetric one, such as
cable television (CATV) network.
• Techniques and system architectures that can efficiently
support asymmetric applications will therefore be a
requirement for future use.
• Mobile communication between a mobile device and a static
computer system is intrinsically asymmetric. A device is allocated a
limited bandwidth.
• This is because a large number of devices access the network.
Bandwidth in the downstream from the server to the device is much
larger than the one in the upstream from the device to the server.
• This is because mobile devices have limited power resources and
also due to the fact that faster data transmission rates for long
intervals of time need greater power dissipation from the devices.
• In GSM networks data transmission rates go up to a maximum of
14.4 kbps for both uplink and downlink.
• The communication is symmetric and this symmetry can be
maintained because GSM is only used for voice communication.
Figure: Communication asymmetry in uplink and downlink, and participation of device
APIs and distributed computing systems when an application runs

The figure shows communication asymmetry in uplink and downlink
in a mobile network. The participation of device APIs and distributed
computing systems in the running of an application is also shown.
Communication Asymmetry
• Mobile communication between the mobile
device and a static computer system is
intrinsically asymmetric
• Each device is allocated a limited bandwidth
because of the large number of devices
• Bandwidth in the downstream from the
server to device much larger than the
one in the upstream from the device to
server because mobile devices have
limited power resources
• Faster data transmission rates for long
intervals of time need greater power
dissipation from the devices
Uplink and downlink in a mobile network
GSM networks data transmission
• Rates go up to a maximum of 14.4 kbps
for both uplink and downlink
• Symmetric communication
• Only used for voice communication
i-mode for many applications
• Used for voice communication, multimedia
transmission, and Internet access
• Base station provides downlink 384 kbps
• Uplink from the devices restricted to 64 kbps
• Asymmetric communication
The characteristics in wireless signals
• Interference and time-dispersion
• Signal distortion and transmission errors at
the receiver end
• Lead to path loss and signal fading, which
cause data loss
• Greater access latency compared to wired
networks
The characteristics in wireless signals
• Data loss has to be taken care of by
repeat transmissions
• Transmission errors have to be corrected
• Taken care of by appending additional
bits, such as the forward error correction
bits
The characteristics in mobile communication
• Mobile devices also have low storage
capacity (memory)
• Cannot hoard large databases
• Accessing the data online not only has a
latency period (is not instantaneous) but
also dissipates bandwidth resources of
the device
Broadcasting
• Corresponds to unidirectional communication
(downlink from the server to the devices)
• Unicast communication: the transmission of
data packets in a computer network such that
a single destination receives the packets
Broadcasting or application distribution service
• This destination is generally the one that
has subscribed to the service
• Mobile TV is an example of the unidirectional
unicast mode of broadcasting
• Each device receives broadcast data
packets from the service provider's
application-distribution system
Broadcasting or application distribution service
• Application–distribution system
broadcasts data of text, audio, or video
services
A broadcasting architecture
Classification of Data-Delivery Mechanisms
• There are two fundamental information delivery methods
for wireless data applications: Point-to-Point access and
Broadcast.
• Compared with Point-to-Point access, broadcast is a
more attractive method. A single broadcast of a data
item can satisfy all the outstanding requests for that item
simultaneously.
• As such, broadcast can scale up to an arbitrary
number of users. There are three kinds of broadcast
models, namely push-based broadcast, On-demand
(or pull-based) broadcast, and hybrid broadcast.
• In push based broadcast, the server disseminates
information using a periodic/aperiodic broadcast
program (generally without any intervention of clients).
• In on-demand broadcast, the server disseminates
information based on the outstanding requests submitted
by clients.
• In hybrid broadcast, push-based broadcast
and on-demand data delivery are combined to
complement each other.
• In addition, mobile computers consume less battery
power on monitoring broadcast channels to receive data
than accessing data through point-to-point
communications.
• Data-delivery mechanisms can be classified into three
categories, namely, push-based mechanisms (publish-subscribe
mode), pull-based mechanisms (on-demand mode), and
hybrid mechanisms (hybrid mode).
Classification of Data-Delivery
Mechanisms
• Push-based mechanisms (publish–
subscribe mode)
• Pull-based mechanisms (on-demand
mode)
• Hybrid mechanisms (hybrid mode)
Push-based Mechanisms
• The server pushes data records from a set of distributed computing systems.
• Examples are advertisers or generators of traffic congestion, weather reports,
stock quotes, and news reports.
• The following figure shows a push-based data-delivery mechanism in which a
server or computing system pushes the data records from a set of
distributed computing systems.
• The data records are pushed to mobile devices by broadcasting without any
demand.
• The push mode is also known as publish-subscribe mode in which the data is
pushed as per the subscription for a push service by a user.
• The subscribed query for a data record is treated as a perpetual query until the
user unsubscribes from that service. Data can also be pushed without user
subscription.
Push-based data-delivery
mechanism
Push-based mechanisms function in the
following manner:
• A structure of data records to be pushed is selected. An algorithm
provides an adaptable multi-level mechanism that permits data
items to be pushed uniformly or non-uniformly after structuring them
according to their relative importance.
• Data is pushed at selected time intervals using an adaptive
algorithm. Pushing only once saves bandwidth. However, pushing at
periodic intervals is important because it provides the devices that
were disconnected at the time of previous push with a chance to
cache the data when it is pushed again.
• Bandwidths are adapted for downlink (for pushes) using an
algorithm. Usually higher bandwidth is allocated to records having
higher number of subscribers or to those with higher access
probabilities.
• A mechanism is also adopted to stop pushes when a device is
handed over to another cell.
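
The bandwidth-adaptation step above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `allocate_downlink`, the record names, and the subscriber counts are hypothetical, and a simple proportional share stands in for whatever adaptive algorithm a real server would use.

```python
def allocate_downlink(total_kbps, subscribers):
    """Split a downlink budget across records in proportion to subscriber counts.

    `subscribers` maps a record name to its number of subscribers.
    Proportional sharing is an assumed policy, not one prescribed by the slides.
    """
    if not subscribers:
        return {}
    total_subs = sum(subscribers.values())
    if total_subs == 0:
        # No subscriber information: fall back to an equal split.
        equal_share = total_kbps / len(subscribers)
        return {record: equal_share for record in subscribers}
    return {
        record: total_kbps * count / total_subs
        for record, count in subscribers.items()
    }


# Hypothetical example: a 384 kbps downlink shared by three pushed records.
shares = allocate_downlink(384, {"traffic": 600, "weather": 300, "stocks": 100})
```

Records with more subscribers receive a larger share, which matches the rule that higher bandwidth goes to records with higher access probabilities.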
• Advantages of Push based mechanisms:
–  Push-based mechanisms enable broadcast of data services to
multiple devices.
–  The server is not interrupted frequently by requests from
mobile devices.
–  These mechanisms also prevent server overload, which might
be caused by flooding of device requests
–  Also, the user even gets the data he would have otherwise
ignored such as traffic congestion, forthcoming weather reports
etc
• Disadvantages:
–  Push-based mechanisms disseminate unsolicited,
irrelevant, or out-of-context data, which may cause
inconvenience to the user.
Pull based Mechanisms
• The user-device or computing system pulls the data records from
the service provider's application database server or from a set
of distributed computing systems.
• Examples are music album server, ring tones server, video clips
server, or bank account activity server.
• Records are pulled by the mobile devices on demand, followed
by a selective response from the server.
• Selective response means that server transmits data packets as
response selectively, for example, after client-authentication,
verification, or subscription account check. The pull mode is also
known as the on-demand mode.
• The following figure shows a pull-based data-delivery
mechanism in which a device pulls (demands) from a server or
computing system, the data records generated by a set of
distributed computing systems.

Pull-based mechanisms function in the
following manner:
1. The bandwidth used for the uplink channel depends upon the
number of pull requests.
2. A pull threshold is selected. This threshold limits the number of
pull requests in a given period of time. This controls the number
of server interruptions.
3. A mechanism is adopted to prevent the device from pulling from
a cell, which has handed over the concerned device to another
cell. On device handoff, the subscription is cancelled or passed on
to the new service provider cell
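
The pull threshold in step 2 can be illustrated as a sliding-window limiter. This is a minimal sketch, assuming a threshold of the form "at most N pull requests per window of W seconds"; the class name `PullThreshold` and its parameters are hypothetical.

```python
from collections import deque


class PullThreshold:
    """Accept at most `max_requests` pull requests per sliding time window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._accepted = deque()  # timestamps of accepted requests, oldest first

    def allow(self, now):
        # Forget accepted requests that have left the window.
        while self._accepted and now - self._accepted[0] >= self.window_seconds:
            self._accepted.popleft()
        if len(self._accepted) < self.max_requests:
            self._accepted.append(now)
            return True
        return False  # over the threshold: the server is not interrupted


# Hypothetical example: two pulls allowed per 10-second window.
limiter = PullThreshold(max_requests=2, window_seconds=10)
decisions = [limiter.allow(t) for t in (0, 1, 2, 11)]
```

The third request is rejected because two requests were already accepted in the window; by time 11 both have expired, so the fourth is accepted.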

• In pull-based mechanisms, the user device receives data records
sent by the server on demand only.
• Advantages of Pull based mechanisms:
–  With pull-based mechanisms, no unsolicited or irrelevant data
arrives at the device and the relevant data is disseminated only
when the user asks for it.
–  Pull-based mechanisms are the best option when the server
has very little contention and is able to respond to many device
requests within expected time intervals.

• Disadvantages:
–  The server faces frequent interruptions and queues of
requests at the server may cause congestion in cases of sudden
rise in demand for certain data record.
–  In on-demand mode, another disadvantage is the energy and
bandwidth required for sending the requests for hot items and
temporal records
Hybrid Mechanisms
• A hybrid data-delivery mechanism integrates pushes and pulls. The
hybrid mechanism is also known as interleaved-push-and-pull (IPP)
mechanism.
• The devices use the back channel to send pull requests for records,
which are not regularly pushed by the front channel.
• The front channel uses algorithms modeled as broadcast disks and
sends the generated interleaved responses to the pull requests.
• The user device or computing system pulls as well as receives the
pushes of the data records from the service provider's application
server or database server or from a set of distributed computing
systems.
• Best example would be a system for advertising and selling
music albums. The advertisements are pushed and the mobile
devices pull for buying the album.

Hybrid interleaved push-pull-based data-delivery mechanism
• The above figure shows a hybrid
interleaved, push-pull-based data-delivery
mechanism in which a device pulls
(demands) from a server and the server
interleaves the responses along with the
pushes of the data records generated by a
set of distributed computing systems.

Hybrid mechanisms function in the
following manner:
1. There are two channels, one for pushes by front channel and the other
for pulls by back channel.
2. Bandwidth is shared and adapted between the two channels depending
upon the number of active devices receiving data from the server and the
number of devices requesting data pulls from the server.
3. An algorithm can adaptively chop the slowest level of the scheduled
pushes successively. The data records at lower levels, where the records are
assigned lower priorities, can have long push intervals in a broadcasting
model.
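
Step 2 above can be sketched as a split of a shared bandwidth budget between the front (push) and back (pull) channels. The proportional rule, the `min_fraction` floor, and the function name `share_bandwidth` are assumptions made for illustration; the slides do not specify the adaptation algorithm.

```python
def share_bandwidth(total_kbps, receiving_devices, pulling_devices, min_fraction=0.1):
    """Return (push_kbps, pull_kbps) for the front and back channels.

    The push share follows the fraction of devices that are receiving pushes,
    clamped so that neither channel drops below `min_fraction` of the total.
    """
    total_devices = receiving_devices + pulling_devices
    if total_devices == 0:
        push_fraction = 0.5  # no load information: split evenly
    else:
        push_fraction = receiving_devices / total_devices
    push_fraction = min(max(push_fraction, min_fraction), 1 - min_fraction)
    push_kbps = total_kbps * push_fraction
    return push_kbps, total_kbps - push_kbps


# Hypothetical example: 300 devices receiving pushes, 100 sending pull requests.
push_kbps, pull_kbps = share_bandwidth(448, receiving_devices=300, pulling_devices=100)
```

The floor keeps the back channel usable even when almost no device is pulling, so a sudden pull request does not find the channel starved.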
Advantages of Hybrid mechanisms:
–  The number of server interruptions and queued requests are significantly
reduced.
Disadvantages:
–  IPP does not eliminate the typical server problems of too many
interruptions and queued requests.
–  Another disadvantage is the adaptive chopping of the slowest level of
scheduled pushes.
Selective Tuning and Indexing
Techniques
• The purpose of pushing and adapting to a broadcast
model is to push records of greater interest with greater
frequency in order to reduce access time or average
access latency.
• A mobile device does not have sufficient energy to
continuously cache the broadcast records and hoard
them in its memory.
• A device has to dissipate more power if it gets each
pushed item and caches it.
• Therefore, it should be activated for listening and
caching only when it is going to receive the selected data
records or buckets of interest.
• During remaining time intervals, that is, when the
broadcast data buckets or records are not of its interest,
it switches to idle or power down mode.
• Selective tuning is a process by which a client device selects only the
required pushed buckets or records, tunes to them, and caches
them.
• Tuning means getting ready for caching at those instants and
intervals when a selected record of interest is broadcast. Broadcast
data has a structure and overhead.
• Data broadcast from the server, which is organized into buckets, is
interleaved. The server prefixes a directory, hash parameter (from
which the device finds the key), or index to the buckets.
• These prefixes form the basis of different methods of selective
tuning.
• Access time (t_access) is the time interval between a pull
request from the device and reception of the response from the
broadcasting, data-pushing, or responding server. Two
important factors affect t_access:
– (i) the number and size of the records to be broadcast and
– (ii) the directory- or cache-miss factor (if there is a miss, then
the response from the server can be received only in a subsequent
broadcast cycle or a subsequent repeat broadcast in the cycle).
Directory Method
• One of the methods for selective tuning involves
broadcasting a directory as overhead at the beginning of
each broadcast cycle.
• If the interval between the starts of successive broadcast
cycles is T, then the directory is broadcast at
successive intervals of T.
• A directory can be provided which specifies when a
specific record or data item appears in the data being
broadcast.
• For example, a directory (at header of the cycle) consists
of directory start sign, 10, 20, 52, directory end sign.
• It means that after the directory end sign, the 10th,
20th and 52nd buckets contain the data items in
response to the device request. The device selectively
tunes to these buckets from the broadcast data.
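
The directory example above (10, 20, 52) can be turned into a listening schedule. This sketch assumes that all buckets take the same time to broadcast and that bucket positions are 1-based and counted from the directory end sign; the function name `wake_schedule` and the time values are hypothetical.

```python
def wake_schedule(directory, bucket_time, directory_time):
    """Return (start, end) listening windows measured from the start of the cycle."""
    windows = [(0.0, directory_time)]  # the device first listens to the directory
    for position in sorted(directory):
        # Bucket `position` starts after the directory and (position - 1) buckets.
        start = directory_time + (position - 1) * bucket_time
        windows.append((start, start + bucket_time))
    return windows


# Directory from the example: buckets 10, 20, and 52 hold the requested data.
windows = wake_schedule([10, 20, 52], bucket_time=1.0, directory_time=2.0)
active = sum(end - start for start, end in windows)
```

With these assumed timings the device is active for 5 time units, whereas listening continuously until the 52nd bucket ends would keep it active for 54.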
• A device has to wait for the directory, consisting of a start sign,
pointers for locating buckets or records, and an end sign.
• Then it has to wait for the required bucket or record
before it can tune to it and start caching it.
• Tuning time (t_tune) is the time taken by the device for
selection of records.
• This includes the time lapse before the device starts
receiving data from the server. In other words, it is the
sum of three periods: time spent listening to the
directory signs and pointers for the record in order to
select a bucket or record required by the device, time spent waiting
for the buckets of interest while actively listening (getting
the incoming record wirelessly), and time spent caching the
broadcast data record or bucket.

• The device selectively tunes to the
broadcast data to download the records of
interest.
• When a directory is broadcast along with the
data records, it minimizes t_tune and
t_access.
• The device saves energy by remaining active
just for the periods of caching the directory
and the data buckets.
• For the rest of the period (between the directory end
sign and the start of the required bucket), it
remains idle or performs application tasks.
• Without the use of a directory for tuning, t_tune =
t_access and the device is not idle during any period.
Hash-Based Method
• A hash is the result of operations on a pair of key and record.
• The advantage of broadcasting a hash is that it contains fewer bits
than the key and record separately.
• The operations are done by a hashing function. The server
broadcasts the hash, and the device extracts a key
by applying a function called the hash function (algorithm) to the
data in the record.
• This key is called the hash key.
• The hash-based method entails that the hash for the hashing parameter
(hash key) is broadcast.
• Each device receives it and tunes to the record as per the extracted
key.
• In this method, the records that are of interest to a device or those
required by it are cached from the broadcast cycle by first extracting
and identifying the hash key which provides the location of the
record.

This helps in tuning of the device. Hash-based method can be
described as follows:
1. A separate directory is not broadcast as overhead with each
broadcast cycle.
2. Each broadcast cycle has hash bits for the hash function H, a shift function
S, and the data that it holds. The function S specifies the location of the
record or remaining part of the record relative to the location of hash and,
thus, the time interval for wait before the record can be tuned and cached.
3. Assume that a broadcast cycle pushes the hashing parameters H(R_i) [H and
S] and record R_i. The functions H and S help in tuning to H(R_i) and hence
to R_i as follows: H gives a key which in turn gives the location of H(R_i) in the
broadcast data. In case H generates a key that does not provide the location
of H(R_i) by itself, then the device computes the location from S after the
location of H(R_i). That location has the sequential records R_i and the device
tunes to the records from these locations.
4. In case the device misses the record in the first cycle, it tunes to and
caches it in the next or some other cycle.
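
The roles of H and S can be illustrated with a simplified model. Everything here is an assumption made for the sketch: the hash function `hash_slot`, the layout in which colliding records go to an overflow area after the hashed slots, and the use of the shift as a relative jump to that area. It is not the exact scheme from the slides.

```python
def hash_slot(key, slots):
    # Hypothetical hash function H: sum of character codes modulo the slot count.
    return sum(ord(ch) for ch in key) % slots


def build_cycle(records, slots):
    """Lay out one broadcast cycle: hashed slots first, then overflow buckets."""
    cycle = [None] * slots
    overflow = {}
    for key, value in records.items():
        slot = hash_slot(key, slots)
        if cycle[slot] is None:
            cycle[slot] = {"key": key, "value": value, "shift": None}
        else:
            overflow.setdefault(slot, []).append((key, value))
    for slot, items in overflow.items():
        # Shift S: distance from the slot to its first overflow bucket.
        cycle[slot]["shift"] = len(cycle) - slot
        for key, value in items:
            cycle.append({"key": key, "value": value, "shift": None})
    return cycle


def tune(cycle, key, slots):
    """Return (position, value) for `key`, or None if it is not in this cycle."""
    slot = hash_slot(key, slots)
    bucket = cycle[slot]
    if bucket is None:
        return None
    if bucket["key"] == key:
        return slot, bucket["value"]
    if bucket["shift"] is None:
        return None
    position = slot + bucket["shift"]  # jump computed from S
    while position < len(cycle) and hash_slot(cycle[position]["key"], slots) == slot:
        if cycle[position]["key"] == key:
            return position, cycle[position]["value"]
        position += 1
    return None


# "ab" and "ba" collide under this hash, so "ba" lands in the overflow area.
cycle = build_cycle({"ab": "R0", "ba": "R1", "d": "R2"}, slots=4)
found = tune(cycle, "ba", slots=4)
```

Because the device learns the bucket position before the bucket arrives, it can stay idle until then instead of listening to the whole cycle.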

Index-Based Method
• Indexing is another method for selective tuning. Indexes
temporarily map the location of the buckets.
• At each location, besides the bits for the bucket in
the record of interest, an offset value may also be
specified.
• While an index maps to the absolute location from
the beginning of a broadcast cycle, an offset index is a
number which maps to the relative location after the end
of the present bucket of interest.
• Offset means a value to be used by the device along
with the present location to calculate the wait period for
tuning to the next bucket. All buckets have an offset to
the beginning of the next indexed bucket or item.
• Indexing is a technique in which each data
bucket, record, or record block of interest is
assigned an index at the previous data bucket,
record, or record block of interest to enable
the device to tune and cache the bucket after
the wait as per the offset value.
• The server transmits this index at the beginning
of a broadcast cycle as well as with each bucket
corresponding to data of interest to the device.
• A disadvantage of using an index is that it
extends the broadcast cycle and hence
increases t_access.
• The index I has several offsets and the bucket type and flag information. A
typical index may consist of the following:
1. Ioffset(1) which defines the offset to first bucket of nearest index.
2. Additional information about Tb, which is the time required for caching the bucket
bits in full after the device tunes to and starts caching the bucket. This enables
transmission of buckets of variable lengths.
3. Ioffset (next) which is the index offset of next bucket record of interest.
4. Ioffset(end) which is the index offset for the end of broadcast cycle and the start of next
cycle. This enables the device to look for next index I after the time interval as per
Ioffset(end). This also permits a broadcast cycle to consist of variable number of buckets.
5. Itype , which provides the specification of the type of contents of next bucket to be tuned,
that is, whether it has an index value or data.
6. A flag called dirty flag which contains the information whether the indexed buckets
defined by Ioffset(1) and Ioffset(next) are dirty or not. An indexed bucket being dirty
means that it has been rewritten at the server with new values. Therefore, the device
should invalidate the previous caches of these buckets and update them by tuning to
and caching them.
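
The index fields listed above can be modelled as a small record, together with one possible decision rule for the device. The field names, the assumption that offsets are expressed as waiting times, and the function `plan_next_step` are all hypothetical; this is a sketch of how the dirty flag and offsets might be used, not a defined protocol.

```python
from dataclasses import dataclass


@dataclass
class BucketIndex:
    offset_first: float   # Ioffset(1): offset to the first bucket of the nearest index
    bucket_time: float    # Tb: time needed to cache the bucket in full
    offset_next: float    # Ioffset(next): offset to the next bucket of interest
    offset_end: float     # Ioffset(end): offset to the end of the broadcast cycle
    next_is_index: bool   # Itype: True if the next bucket holds an index, not data
    dirty: bool           # True if the indexed buckets were rewritten at the server


def plan_next_step(index, cached):
    """Decide how long to doze, how long to listen, and whether to refresh the cache."""
    if cached and not index.dirty:
        # The cached copy is still valid: sleep until the next cycle's index.
        return {"doze": index.offset_end, "listen": 0.0, "refresh": False}
    # Either nothing is cached yet or the cache is stale: tune to the next bucket.
    return {
        "doze": index.offset_next,
        "listen": index.bucket_time,
        "refresh": index.dirty and cached,
    }


# Hypothetical index values for a device that already holds a cached copy.
stale = BucketIndex(3.0, 1.5, 7.0, 40.0, next_is_index=False, dirty=True)
step = plan_next_step(stale, cached=True)
```

A clean index with a valid cache lets the device skip the rest of the cycle, which is where the energy saving comes from.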

Distributed Index-Based Method
• The distributed index-based method is an improvement on
the (I, m) method.
• In this method, there is no need to repeat the complete
index again and again.
• Instead of replicating the whole index m times, each
index segment in a bucket describes only the offset I'
of the data items which immediately follow. Each index I is
partitioned into two parts, I' and I".
• I" consists of k unrepeated levels (sub-indexes),
and I' consists of the top l repeated
levels (sub-indexes).
• Assume that a device misses I (which includes I' and I" once),
transmitted at the beginning of the broadcast cycle. As I'
is repeated m - 1 times after this, it tunes to the pushes
by using I'. The access latency is reduced as I' has fewer
levels.
Flexible Indexing Method
• Assume that a broadcast cycle has a number of data segments, with each of
the segments having a variable set of records. For example, let n records,
R0 to Rn-1, be present in four data segments: R0 to Ri-1, Ri to Rj-1, Rj to
Rk-1, and Rk to Rn-1.
• Some possible index parameters are (i) Iseg, having just 2 bits for
the offset, to specify the location of a segment in a broadcast cycle,
(ii) Irec, having just 6 bits for the offset, to specify the location of a record
of interest within a segment of the broadcast cycle, and (iii) Ib, having just 4
bits for the offset, to specify the location of a bucket of interest within a
record present in one of the segments of the broadcast cycle.
• Flexible indexing method provides dual use of the parameters (e.g.,
use of Iseg or Irec in an index segment to tune to the record or buckets of
interest) or multi-parameter indexing (e.g., use of Iseg, Irec, or Ib in an
index segment to tune to the bucket of interest).

• Assume that a broadcast cycle has m sets of
records (called segments). A set of binary bits
defines the index parameter Iseg. A local index
is then assigned to the specific record (or
bucket). Only the local index (Irec or Ib) is used in
(Iloc, m) based data tuning, which corresponds
to the case of the flexible indexing method being
discussed. The number of bits in a local index is
much smaller than that required when each
record is assigned an index. Therefore, the
flexible indexing method proves to be beneficial.
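
The bit widths quoted above (2 bits for Iseg, 6 for Irec, 4 for Ib) can be illustrated by packing the three offsets into one 12-bit word. The packing order and the function names `pack_index` and `unpack_index` are assumptions for this sketch; the slides give only the field widths.

```python
SEG_BITS, REC_BITS, BKT_BITS = 2, 6, 4  # widths of Iseg, Irec, and Ib


def pack_index(seg, rec, bkt):
    """Pack segment, record, and bucket offsets into a single 12-bit integer."""
    if not (0 <= seg < (1 << SEG_BITS)):
        raise ValueError("segment offset does not fit in 2 bits")
    if not (0 <= rec < (1 << REC_BITS)):
        raise ValueError("record offset does not fit in 6 bits")
    if not (0 <= bkt < (1 << BKT_BITS)):
        raise ValueError("bucket offset does not fit in 4 bits")
    return (seg << (REC_BITS + BKT_BITS)) | (rec << BKT_BITS) | bkt


def unpack_index(word):
    """Recover (seg, rec, bkt) from a word produced by pack_index."""
    bkt = word & ((1 << BKT_BITS) - 1)
    rec = (word >> BKT_BITS) & ((1 << REC_BITS) - 1)
    seg = word >> (REC_BITS + BKT_BITS)
    return seg, rec, bkt


# Hypothetical offsets: segment 2, record 37 in that segment, bucket 9 in that record.
word = pack_index(seg=2, rec=37, bkt=9)
```

A device that only needs the local part can mask off the low 10 bits (Irec and Ib), which is the sense in which a local index is much shorter than a full per-record index.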