DLMBMMIIT01 Session5


LECTURER: MAX MUSTERMANN

INTERNET OF THINGS
TOPIC OUTLINE

1. What is Internet of Things?
2. Social and Business Relevance
3. Architectures of IoT
4. Communication Standards and Technologies
5. Data Storage and Processing
6. Fields of Application
UNIT 5

DATA STORAGE AND PROCESSING


STUDY GOALS

At the end of the lesson you will have learned …

- about NoSQL and MapReduce.
- about linked data, RDF and OWL.
- about complex event processing and machine learning.
EXPLAIN SIMPLY

1. What is the challenge of distributed databases?

2. Explain the concept of linked data.

3. What is a microservice?
DATA STORAGE AND PROCESSING

NoSQL & MapReduce
- NoSQL benefits & challenges
- MapReduce

Linked Data
- Ontologies, RDF, OWL

Data Processing
- Microservices
- CEP + ML
NOSQL DATABASES

- Overcoming deadlocks and concurrency issues of relational database management systems (RDBMS)
- Challenges: scalability, availability and consistency
  - Scalability: sharding
  - Availability: replication (master/slave or more)
  - Consistency (read & write): locks or eventual consistency
- Different data models of NoSQL
  - Document-oriented (e.g. MongoDB)
  - Key-value / wide-column store (e.g. Cassandra)
  - (Graphs, time series)
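
Sharding can be illustrated with a minimal sketch. It assumes hash-based sharding (one of several possible strategies) and uses hypothetical names (`shard_for`, `distribute`, the `sensor-N` keys) that are not part of any particular NoSQL product.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer and fold it into the shard range.
    return int.from_bytes(digest[:8], "big") % num_shards


def distribute(keys, num_shards):
    """Group keys by the shard they would be stored on."""
    shards = {index: [] for index in range(num_shards)}
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards


# Hypothetical sensor IDs spread over three shards.
sensor_ids = [f"sensor-{n}" for n in range(12)]
shards = distribute(sensor_ids, num_shards=3)
for index, members in shards.items():
    print(index, members)
```

Because the hash is stable, every node computes the same shard for the same key without coordination. A plain modulo scheme like this one reassigns most keys when the number of shards changes, which is one reason real systems often prefer consistent hashing.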
MAP REDUCE I
OVERVIEW

- Algorithm for parallel processing of massive amounts of data.
- Parallel processing on multiple nodes instead of client/server
- Works with key-value pairs
- IoT use cases:
  - Statistical functions: min, max, sum, mean
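
The statistical use case can be sketched as a single-process imitation of the map, shuffle and reduce steps. This is an illustration only: a real MapReduce framework runs these phases on multiple nodes, and the function names and sample readings below are assumptions.

```python
from collections import defaultdict


def map_phase(records):
    """Emit one (key, value) pair per sensor reading."""
    for sensor_id, value in records:
        yield sensor_id, value


def shuffle(pairs):
    """Group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(values):
    """Collapse the values of one key into summary statistics."""
    total = sum(values)
    return {
        "min": min(values),
        "max": max(values),
        "sum": total,
        "mean": total / len(values),
    }


def map_reduce(records):
    grouped = shuffle(map_phase(records))
    return {key: reduce_phase(values) for key, values in grouped.items()}


# Hypothetical temperature readings as (sensor_id, value) pairs.
readings = [("t1", 20.0), ("t2", 18.0), ("t1", 22.0), ("t2", 19.0), ("t1", 24.0)]
stats = map_reduce(readings)
print(stats)
```

Since each key is reduced independently, the reduce calls could run on different nodes, which is where the parallelism comes from.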
MAP REDUCE II
ALGORITHM

Source of the graphic: Yahoo Developer Network, 2018.


LINKED DATA, RDF & OWL

- The heterogeneous IoT landscape faces the challenge of capturing too much data with too little interoperability.
- Ontologies provide computer-readable semantics for the data.
- RDF is a data model to describe resources (devices and sensors).
- Web Ontology Language (OWL) adds constraints to RDF.
- Linked Data connects ontologies.
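
The RDF data model describes resources as subject-predicate-object triples. The sketch below models triples as plain Python tuples and answers a simple pattern query. The `http://example.org/iot/` namespace, the sensor and room names, and the `match` helper are hypothetical; a real project would more likely use an RDF library and SPARQL.

```python
EX = "http://example.org/iot/"  # hypothetical namespace for this sketch
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Each triple is (subject, predicate, object), all identified by URIs.
triples = {
    (EX + "sensor42", RDF_TYPE, EX + "TemperatureSensor"),
    (EX + "sensor42", EX + "locatedIn", EX + "room7"),
    (EX + "sensor43", RDF_TYPE, EX + "HumiditySensor"),
    (EX + "sensor43", EX + "locatedIn", EX + "room7"),
}


def match(graph, subject=None, predicate=None, obj=None):
    """Return all triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# Which resources are located in room7?
in_room7 = [s for s, _, _ in match(triples, predicate=EX + "locatedIn", obj=EX + "room7")]
print(in_room7)
```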
LINKED DATA IN DETAIL

1. Use URIs to name (identify) things.
2. Use HTTP URLs so that these things can be looked up (interpreted, "dereferenced").
3. Provide useful information about what a name identifies when it's looked up, using open standards such as RDF, SPARQL, etc.
4. Refer to other things using their HTTP URI-based names when publishing data on the Web.
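
The principles above can be sketched by publishing a few statements in the N-Triples line format, where every name is an HTTP URI and one statement points into another dataset. This sketch handles URI terms only (N-Triples also allows literals and blank nodes), and all URIs and helper names are hypothetical.

```python
from urllib.parse import urlparse


def is_http_uri(name: str) -> bool:
    """Principles 1 and 2: names should be HTTP(S) URIs that can be looked up."""
    parts = urlparse(name)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def to_ntriples(triples) -> str:
    """Principle 3: serialize statements in an open standard format."""
    lines = []
    for subject, predicate, obj in triples:
        for term in (subject, predicate, obj):
            if not is_http_uri(term):
                raise ValueError(f"not an HTTP URI: {term!r}")
        lines.append(f"<{subject}> <{predicate}> <{obj}> .")
    return "\n".join(lines)


statements = [
    # Principle 4: the object is a name from a different (hypothetical) dataset.
    ("http://example.org/iot/sensor42",
     "http://example.org/iot/locatedIn",
     "http://example.com/geo/Berlin"),
]
document = to_ntriples(statements)
print(document)
```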
MICROSERVICES

Monolith: single server application including all modules
- Disadvantages:
  - hard to scale
  - big releases

Microservices: architectural pattern to decompose an application into small software services (distributed system)
- Advantages:
  - easy to scale (services individually)
  - small and individual releases
- Disadvantages:
  - more complex
  - communication overhead
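
A single small service might look like the following sketch, built only on Python's standard library. The `/spots` endpoint, the in-memory `SPOT_STATUS` data and the helper names are hypothetical; production services typically add a web framework, persistence and authentication.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical state owned exclusively by this one service.
SPOT_STATUS = {"spot-1": "free", "spot-2": "occupied"}


class OccupancyHandler(BaseHTTPRequestHandler):
    """Serves the occupancy data over HTTP as JSON."""

    def do_GET(self):
        if self.path == "/spots":
            body = json.dumps(SPOT_STATUS).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def start_service(port=0):
    """Start the service in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), OccupancyHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_spots(server):
    """Act as another service that consumes the API over the network."""
    host, port = server.server_address
    with urllib.request.urlopen(f"http://{host}:{port}/spots") as response:
        return json.loads(response.read())
```

Calling `start_service()` and then `fetch_spots(server)` returns the occupancy dictionary. The HTTP round trip between caller and service is the communication overhead that a monolith avoids with an in-process function call.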
COMPLEX EVENT PROCESSING I

CEP evaluates data based on certain rules and patterns in real time.

Source of the graphic: Adi, 2006.
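
As an illustration of a CEP rule, the sketch below raises one complex event when three consecutive temperature readings exceed a threshold within a ten-second window. The rule, its parameters and the sample events are assumptions made for this example; CEP engines usually express such patterns in a dedicated rule language.

```python
from collections import deque


def detect_overheating(events, threshold=80.0, count=3, window_s=10.0):
    """Return timestamps at which the overheating pattern completes.

    events: iterable of (timestamp_seconds, temperature), ordered by time.
    """
    recent = deque()  # timestamps of consecutive readings above the threshold
    alerts = []
    for timestamp, temperature in events:
        if temperature <= threshold:
            recent.clear()  # the pattern requires consecutive high readings
            continue
        recent.append(timestamp)
        # Forget high readings that fell out of the time window.
        while timestamp - recent[0] > window_s:
            recent.popleft()
        if len(recent) >= count:
            alerts.append(timestamp)
            recent.clear()  # start looking for the next occurrence
    return alerts


# Hypothetical event stream.
stream = [(0, 70.0), (1, 85.0), (2, 90.0), (3, 88.0), (4, 60.0),
          (20, 81.0), (40, 82.0), (60, 83.0)]
alerts = detect_overheating(stream)
print(alerts)
```

Only the burst at seconds 1 to 3 triggers an alert. The later high readings are too far apart in time to satisfy the window.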


COMPLEX EVENT PROCESSING II

Overload situations occur if more events arrive than a single computer can process.

Mitigation of overload:
- Buffering
- Load shedding
- Parallelization
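
Buffering and load shedding can be combined in a bounded queue, as in the following sketch. It assumes the simplest shedding policy, dropping newly arriving events once the buffer is full; other policies (dropping the oldest or the least important events) are equally possible. The class and method names are hypothetical.

```python
from collections import deque


class SheddingBuffer:
    """Bounded event buffer that discards new events when it is full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.queue = deque()
        self.dropped = 0

    def offer(self, event) -> bool:
        """Buffer the event, or shed it if there is no room left."""
        if len(self.queue) >= self.capacity:
            self.dropped += 1
            return False
        self.queue.append(event)
        return True

    def poll(self):
        """Hand the oldest buffered event to the CEP operator, if any."""
        return self.queue.popleft() if self.queue else None


# A burst of five events hits a buffer with room for three.
buffer = SheddingBuffer(capacity=3)
accepted = [buffer.offer(event_id) for event_id in range(5)]
print(accepted, buffer.dropped)
```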
MACHINE LEARNING

IoT sensors and devices are producing data. But the business
needs information to make decisions.

Beyond statistical functions and CEP, machine learning offers:

- Supervised learning: classification and regression
- Unsupervised learning: clustering, anomaly detection (e.g. predictive maintenance)
- Semi-supervised learning: works with labeled and unlabeled data
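
Anomaly detection needs no labels, which the sketch below illustrates with a deliberately simple statistical baseline: readings whose z-score exceeds a threshold are flagged. This is a stand-in for a trained model, not a full machine learning pipeline, and the threshold and sample readings are assumptions.

```python
import statistics


def find_anomalies(values, z_threshold=2.0):
    """Return the indices of values that deviate strongly from the mean."""
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []  # all values identical, nothing stands out
    return [
        index for index, value in enumerate(values)
        if abs(value - mean) / spread > z_threshold
    ]


# Hypothetical motor temperature readings with one outlier.
readings = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 35.0, 20.1]
anomalies = find_anomalies(readings)
print(anomalies)
```

In a predictive maintenance setting, a flagged reading would prompt an inspection before the device fails.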
REVIEW

You have learned …

- about NoSQL and MapReduce.
- about linked data, RDF and OWL.
- about complex event processing and machine learning.
SESSION 5

TRANSFER TASK

Group Work MQTT


Scenario: Smart Parking for 200 parking spots in our city. Each free/occupied MQTT message requires 300 bytes of transmitted data, of which 40 bytes are plain sensor data. Each spot averages 55 parking processes per day.

1. Research online: Find an IoT data plan for cellular networks and calculate the yearly cost of data transfer over the network.
2. Calculate the yearly storage cost in a cloud storage service (e.g. AWS DynamoDB or S3) with a cloud cost calculator (e.g. https://calculator.aws/).
3. Would you consider the data as big data? (Explain why)
4. Think about use cases for CEP rules that could be applied on the data.
5. Think about use cases for Machine Learning that could be applied on the data.
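
As a starting point for tasks 1 to 3, the yearly data volumes can be estimated with a short calculation. The sketch assumes that each parking process produces two messages (one "occupied", one "free"), which the scenario does not state explicitly, and that only the 40 bytes of sensor data are stored. Prices are left out because they depend on the plan and provider you research.

```python
SPOTS = 200
PROCESSES_PER_SPOT_PER_DAY = 55
MESSAGE_BYTES = 300        # transmitted per MQTT message
PAYLOAD_BYTES = 40         # plain sensor data per message
MESSAGES_PER_PROCESS = 2   # assumption: one "occupied" plus one "free" message
DAYS_PER_YEAR = 365


def yearly_bytes(bytes_per_message, messages_per_process=MESSAGES_PER_PROCESS):
    """Total bytes per year across all parking spots."""
    messages_per_day = SPOTS * PROCESSES_PER_SPOT_PER_DAY * messages_per_process
    return messages_per_day * bytes_per_message * DAYS_PER_YEAR


transfer_bytes = yearly_bytes(MESSAGE_BYTES)
storage_bytes = yearly_bytes(PAYLOAD_BYTES)
print(f"Yearly transfer: {transfer_bytes / 1e9:.3f} GB")
print(f"Yearly storage:  {storage_bytes / 1e9:.3f} GB")
```

Under these assumptions the transfer volume is about 2.4 GB per year and the stored sensor data about 0.32 GB per year. Multiply each figure by the per-GB price of the plan or storage service you chose.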
TRANSFER TASK

Please present your results.
The results will be discussed in plenary.
LEARNING CONTROL

1. What are the main drawbacks of relational databases which emerge with the rapid growth of data?
a) They can be hosted on a cluster of processors.
b) They are widely prone to deadlocks and other concurrency issues.
c) They organize the data as a set of tables with columns and rows.
d) All of these answers.
LEARNING CONTROL

2. What is the main shortcoming of the Resource Description Framework (RDF) data model?
a) RDF does not provide ways to represent constraints.
b) Each resource is only identified by a Uniform Resource Identifier (URI).
c) RDF is not capable of specifying how resources are inter-related.
d) RDF cannot hide the heterogeneity of the underlying IoT devices from the applications that use them.
LEARNING CONTROL

3. What is the core idea behind load shedding in complex event processing?
a) With load shedding, a large number of events are buffered before being processed.
b) With load shedding, parallel CEP operators are capable of meeting buffering limits.
c) With load shedding, the CEP operator simply discards events that it is not capable of processing in time.
d) With load shedding, the resources are underutilized when the traffic intensity is low.
LIST OF SOURCES

Adi, A. (2006). Complex event processing. IBM Event-based Middleware & Solutions group. Haifa: IBM Haifa Labs.
Yahoo Developer Network (2018). Yahoo Hadoop Tutorial. Retrieved from https://developer.yahoo.com/hadoop/tutorial/module1.html.