URN:ISBN:978-952-61-4462-7


PUBLICATIONS OF

THE UNIVERSITY OF EASTERN FINLAND

Dissertations in Forestry
and Natural Sciences

ULLA GAIN

Framing of cognitively
computed insights
Proofs of concept by cognitive services
Publications of the University of Eastern Finland


Dissertations in Forestry and Natural Sciences
No 462

University of Eastern Finland


Joensuu/Kuopio
2022
PunaMusta Oy
Joensuu, 2022
Editors: Pertti Pasanen, Nina Hakulinen, Raine Kortet, Matti Tedre and
Jukka Tuomela
Sales: University of Eastern Finland Library
ISBN: 978-952-61-4461-0 (print)
ISBN: 978-952-61-4462-7 (PDF)
ISSN-L: 1798-5668
ISSN: 1798-5668
ISSN: 1798-5676 (PDF)
Author’s address: Ulla Gain
University of Eastern Finland
School of Computing
P.O. Box 1627
70211 KUOPIO, FINLAND
email: [email protected]

Supervisors: Virpi Hotti, Ph.D.


University of Eastern Finland
School of Computing
P.O. Box 1627
70211 KUOPIO, FINLAND
email: [email protected]

Professor Pekka Toivanen, D.Sc. (Tech.)


University of Eastern Finland
School of Computing
P.O. Box 1627
70211 KUOPIO, FINLAND
email: [email protected]

Reviewers: Tuomo Kujala, Ph.D., Associate Professor


University of Jyväskylä, Cognitive Science
Faculty of Information Technology
40014 JYVÄSKYLÄN YLIOPISTO, FINLAND
email: [email protected]

Annika Wolff, Ph.D., Assistant Professor


Lappeenranta-Lahti University of Technology
School of Engineering Science
53850 LAPPEENRANTA, FINLAND
email: [email protected]
Opponent: Sanna Kumpulainen, Ph.D., Associate Professor
Tampere University, Tampere Research Center
for Information and Media
Faculty of Information Technology and
Communication Sciences
Kalevantie 4, 33100 TAMPERE, FINLAND
email: [email protected]
Gain, Ulla
Framing of cognitively computed insights. Proofs of concept by cognitive
services.
Kuopio: University of Eastern Finland, 2022
Publications of the University of Eastern Finland
Dissertations in Forestry and Natural Sciences 2022; 462
ISBN: 978-952-61-4461-0 (print)
ISSN-L: 1798-5668
ISSN: 1798-5668
ISBN: 978-952-61-4462-7 (PDF)
ISSN: 1798-5676 (PDF)

ABSTRACT

In business practices, there is a need for new tools to manifest insights from
data. There is an ongoing research gap in assessing what benefits new tools
(e.g., cognitive services) offer, what they can automate, and how they affect
human behaviour. This dissertation aims to fill the gap with answers to the
following research questions: What are the correspondences between
human cognition and cognitive services?; How to amplify human cognition
within the cognitively computed insights?; and How can the impacts of the
cognitively computed data for organisations be assessed? Cognitively
computed insights are manifested either by cognitive services or automated
machine-learning (ML) frameworks.
The main results included 20 constructions: four abstractions, seven
experiments, three frameworks, and six mappings. The term "construction"
refers to a typed entity, the types of which are abstraction, experiment,
framework, and mapping. The main meanings of the typed constructions
are vocabulary based, and their meanings are illustrated in the dissertation
as follows: abstraction is relevant information used to highlight a specific
purpose; experiment is a purposive investigation to gain experience and
pieces of evidence; framework is a set of reusable elements used to guide

solution-focused development; and mapping is an assigned
correspondence between two entities to explain differences and similarities.
The constructions were established within existing frameworks, which helps
reveal interest in the focus of this thesis. The constructions were the result
of adapting several frameworks concerning automated machine learning
(e.g., Pycaret), brain models (3D Brain, the layered reference model of the
brain), business model canvases (e.g., the value proposition canvas),
McKinsey’s automation capabilities, personality traits and types (global
vectors for word representation, International Personality Item Pool), and
value propositions (e.g., Bøe-Lillegraven’s ambidexterity value chains).
Moreover, the following cognitive services exemplify the constructions: IBM
Personality Insights, IBM Tone Analyzer, IBM Natural Language
Understanding (previously Alchemy Language), IBM Retrieve and Rank, IBM
Tradeoff Analytics, IBM Discovery, IBM Visual Recognition, and Microsoft
Speaker Recognition.
First, correspondences between human cognition and cognitive services
were explored. Eight constructions (one abstraction, three experiments, one
framework, and three mappings) derived parts of the answer. Many human
cognitive functions can reasonably be classified into larger functional
entities. Therefore, human cognitive functions were mapped onto groups of
cognitive functions before the cognitive services were compared with the
human cognitive functions. The abstraction of the functional hierarchy of
cognitive functions concerns the similarities between cognitive services and
the human cognitive functions presented by the 3D Brain model. One
hundred and thirty-seven human cognitive functions were studied
and compared to cognitive services: the IBM Tone Analyzer functionalities
were similar to 65 human cognitive functions; the IBM Visual Recognition
functionalities were similar to 27 human cognitive functions; and the
Microsoft Speaker Recognition functionalities were similar to 45 human
cognitive functions. A framework for the functional hierarchy of cognitive
functions was established to attain comparability and correspondences
between human cognitive functions and functions of the cognitive services.
Two mappings concerned six cognitive services (IBM Natural Language

Understanding, IBM Tone Analyzer, IBM Personality Insights, IBM Retrieve
and Rank, IBM Tradeoff Analytics, and IBM Discovery). First, eleven out of
McKinsey’s 18 automation capabilities concerning work activities were
mapped to cognitive services. Second, four processes out of 52 human
cognitive processes presented by the layered reference model of the brain
were used to replace three automation capabilities. Finally, four discovered
verbs (bind, facilitate, revise, and manifest) were encapsulated through the
experiment concerning rules of the capabilities provided using cognitive
services to facilitate human cognition. The workflow experiment for value
propositions concerned five cognitive services (IBM Alchemy Language, IBM
Natural Language Understanding, IBM Tone Analyzer, IBM Personality
Insights, and Microsoft Text Analytics). It is possible to derive outcomes from
text without human intervention. However, the International Personality
Item Pool framework was used in the experiment to generate the rules for
transforming personality traits into questionnaires. The main aim was to
find the ground truth concerning the value proposition of the cognitive
services to support human cognition. Awareness of both capabilities and
functionalities of the cognitive services can contribute to fulfilling the system
or software system requirements.
Second, the methods for amplifying human cognition within cognitively
computed insights were collated. Six constructions (one abstraction, three
experiments, and two mappings) derived parts of the answer. In the early
stages of the research, big data analytics were used instead of cognitive
services. Therefore, the first review concerned big data analytics. The review
results formed the basis for the abstraction of uncovering information
nuggets from heterogeneous data as part of competitive advantage, which
was constructed to understand what can be calculated, what is worth
calculating, and why. Further, key questions were introduced for data
milling to find indicators. One of the first cognitive services was the IBM
Personality Insights, which was used in two mappings: mapping between
principles of business analytics and personality insights created
transparency between business analytics measurements and refined
personality insights; mapping between personality traits and expected

experience helped to understand customer experience based on
personality traits. The IBM Personality Insights service is an API (application
programming interface) service based on one of the most famous word-
embedding algorithms: Glove (global vectors for word representation). The
experiment included a data dump of 20 web channels (e.g., Facebook
comments) containing 53,294 messages to develop transparency and
understanding concerning corpus-based insights by the IBM Personality
Insights service. However, it was impossible to explain explicitly and exactly
how the IBM Personality Insights service calculates the values of the traits.
In the next experiment, semantic roles (subject, action, object) were coded
from the General Data Protection Regulation (GDPR) by a human interpreter
and the IBM Natural Language Understanding service. Krippendorff’s alpha
value was 0.85, which indicates that the capabilities of the IBM Watson
Natural Language Understanding service can be used to amplify human
interpreters. Cognitive services are not the only means of manifesting
cognitively computed insights. Automated machine-learning frameworks
(e.g., Pycaret) were also reviewed to discover their capabilities, especially within business
intelligence (BI) tools (e.g., Microsoft Power BI). An experiment concerning
the low-code autoML-augmented data pipeline revealed a lack of
interoperable low-code autoML frameworks within BI tools. Only Pycaret
was adaptable in Microsoft Power BI. The outcomes of the cognitive services
and automated machine learning frameworks can be used as building
blocks to construct more significant functional entities. Further, cognitive
services can enhance insights on a product, process, or service, and
therefore they can be targeted to meet the needs of stakeholders and
companies.
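The inter-coder agreement reported above can be reproduced for the two-coder, nominal-data case with a short script. The sketch below is illustrative only: the coding data are invented to mimic semantic-role coding, not the actual GDPR annotations used in the experiment, and the function assumes no missing values.

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(coder_a, coder_b):
    """Krippendorff's alpha for two coders, nominal categories, no missing data."""
    coincidences = Counter()
    for a, b in zip(coder_a, coder_b):
        # Each coded unit contributes both ordered value pairs
        # to the coincidence matrix.
        coincidences[(a, b)] += 1
        coincidences[(b, a)] += 1
    totals = Counter()
    for (a, _b), count in coincidences.items():
        totals[a] += count
    n = sum(totals.values())  # equals 2 * number of coded units
    # Observed disagreement: off-diagonal coincidences.
    observed = sum(c for (a, b), c in coincidences.items() if a != b) / n
    # Expected disagreement under chance pairing of values.
    expected = sum(totals[a] * totals[b]
                   for a, b in permutations(totals, 2)) / (n * (n - 1))
    return 1.0 - observed / expected

# Hypothetical semantic-role codes (S = subject, A = action, O = object)
# assigned to ten text units by a human coder and by a service.
human   = ["S", "A", "O", "S", "A", "O", "S", "A", "O", "S"]
service = ["S", "A", "O", "S", "A", "O", "S", "O", "O", "S"]
print(round(krippendorff_alpha_nominal(human, service), 2))  # 0.85
```

An alpha of 1.0 means perfect agreement, and values above roughly 0.8 are conventionally taken to indicate reliable coding, which is why the reported 0.85 supports using the service to amplify human interpreters.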
Third, ways to assess the impacts of the cognitively computed data for
organisations were gathered. Six constructions (two abstractions, one
experiment, two frameworks, and one mapping) derived parts of the
answer. Three constructions outlined data-driven performance
management: the experiment concerning information nuggets for
indicators emphasised the importance of key questions; the abstraction of
two-way transparency between selected data and principles emphasised

performance monitoring; and the mapping between business analytics and
ambidexterity value chains emphasised value proposition. Usually,
performance management is based on derived data. Both real-time and
inferred insights are required. Therefore, measurable advantages in
digitalisation are proposed: the framework of situational information and
the possibilities of the inferred insights is facilitated by the abstraction of
the utilisation mindset of cognitive service outcomes.
cognitive services (IBM Visual Recognition and Microsoft Speaker
Recognition) were used to build the framework of context-aware
information transparency and smart indicators. The framework supports
the industry in combining Industrial Internet-of-Things (IIoT) business
models and value propositions to match the intelligent insights of cognitive
solutions to business objectives.
Once the constructions had been completed to answer the research
questions, five research process phases (i.e., hype framing/landing,
functional framing, content framing, technical framing, and continuous
impact assessment) were established. The research process phases can be
adapted in proofs of concept. Organisations can use them to establish
proofs of concept before adopting new building blocks or new technology.
The framing shows how the need to amplify human cognition affects how
the impacts can be assessed. In conclusion, and based on the results, the
impact of well-augmented, transparent, objective, and actionable use of
cognitive data, with or without interventions, is built into the nature of
these arguments, so the impact of such usage is self-evident. However,
organisations must conduct their own experiments to identify their
competitiveness and effectiveness by adopting objective insights with the
help of cognitive services. The outcomes of the cognitive services can be
used as building blocks to construct more significant functional entities.
Furthermore, the research can propose constructions to assess the impacts
of the cognitively computed insights.

Keywords: Cognitive computing, cognitively computed insights, proofs of
concept by cognitive services, cognition, assessment

ACKNOWLEDGEMENT

Based on the goal of cleaning up, processing, and digging into data to find
the relevant information nuggets, the hunger for insight has grown with the
development of technology to obtain something cognitively useful from
data. This journey has been long, has absorbed much of one researcher’s
life, and has fuelled the curiosity to search for explanations and seek answers.
Now it is time to thank those who helped. I would like to thank Virpi Hotti,
PhD, and Pekka Toivanen, D.Sc. (Tech.), for their expert advice and
encouragement throughout this challenging project. I want to thank the
reviewers, Associate Professor Tuomo Kujala and Assistant Professor Annika Wolff, for
their constructive advice that has helped improve this thesis. Thank you,
Associate Professor Sanna Kumpulainen, for being my opponent. At the
same time, I would like to thank my family and friends for their
encouragement that came when I needed it; thank you for believing in me.

Kuopio, February 18th, 2022


Ulla Gain

LIST OF ORIGINAL PUBLICATIONS

This thesis is based on data presented in the following articles, referred to
by the Roman numerals I–IX.

I. Hotti V, Gain U. (2013). Big Data Analytics for Professionals, Data-milling
for Laypeople. World Journal of Computer Application and Technology,
1(2):51-57, Horizon Research Publishing.

II. Hotti V, Gain U. (2016). Exploitation and exploration underpin business
and insights underpin business analytics. Communications in Computer
and Information Science, 636:223-237, Springer, Cham.

III. Gain U, Hotti V, Lauronen H. (2017). Automation capabilities challenge
work activities cognitively. Futura, 36(2):25-35.

IV. Gain U, Hotti V. (2017). Tones and traits - experiments of text-based
extractions with cognitive services. Finnish Journal of eHealth and
eWelfare, 9(2-3):82-94.

V. Gain U. (2020). The cognitive function and the framework of the
functional hierarchy. Applied Computing and Informatics, 16(1/2):81-116,
Emerald Publishing Limited. DOI: 10.1016/j.aci.2018.03.003

VI. Gain U, Koponen M, Hotti V. (2018). Behavioral interventions from trait
insights. Communications in Computer and Information Science,
907:14-27, Springer, Cham.

VII. Gain U, Hotti V. (2020). Awareness of automation, data origins, and
processing stakeholders by parsing the General Data Protection Regulation
sanction-based articles. Electronic Government. DOI:
10.1504/EG.2021.10034597

VIII. Gain U, Hotti V. (2021). Low-code autoML-augmented Data Pipeline -
A Review and Experiments. Journal of Physics: Conference Series,
1828:012015.

IX. Gain U. (2021). Applying Frameworks for Cognitive Services in IIoT.
Journal of Systems Science and Systems Engineering, 30:59-84, Springer
Nature. DOI: 10.1007/s11518-021-5480-x

AUTHOR’S CONTRIBUTION

The idea of framing the cognitively computed insights is the author's.
Framing combines the constructions and findings of the research papers
(I–IX). The contributions of the authors are described paper by paper.
I. The research idea was to improve understanding of big data and the
possibilities of data analysis. Gain and Hotti participated equally in
the literature appraisal, analysis, and writing process and formed the
proposed constructions.
II. The research idea was based on the shared knowledge of the authors
(Gain and Hotti) that behaviour management is an important
management area in business. Therefore, the main aim of Paper II
was to encourage experiments around the behaviour-centric value
proposition based on objective insights. The possibilities (e.g.,
ambidexterity and consciousness) of cognitively computed insights,
especially using cognitive services, are illustrated by the recently
published IBM Personality Insights service that offers personality
traits and consumption preferences. Gain and Hotti participated
equally in the literature appraisal, analysis, and writing process and
formed the proposed constructions.
III. The research idea was based on the shared knowledge of the authors
Gain, Hotti, and Lauronen. Both McKinsey's automation capabilities
and Wang’s cognitive processes offer definitions that can be used to
classify cognitive services. The corresponding author (Gain) was the
main contributor to the utilisation mindset and formalising rules
based on the evaluated outcomes of the cognitive services. Hotti and
Lauronen participated in the literature appraisal, analysis, and
writing process.
IV. The research goal was to determine whether there is ground truth
behind the cognitive services, especially IBM Personality Insights. This
is based on the shared doubts of authors Gain and Hotti. The authors

doubted whether there is non-repudiated argumentation behind the
cognitive service extractions (e.g., tones and traits). Gain and Hotti
participated equally in the literature appraisal, analysis, and writing
process. They formed the proposed constructions, for example using
the semantic roles to exemplify the trait-based personality
questionnaire.
V. The author was the sole contributor to this publication. The research
goal was a deeper understanding of cognitive functions, which can
therefore help to better understand cognitive services. The author
performed all the data analysis, interpretation of the results, and
writing.
VI. The research idea concerned the techniques behind cognitive
services, especially the IBM Personality Insights service, based on
experiments involving the cognitive service application programming
interface (API) adaptations by all authors (Gain, Koponen, and Hotti) of
the paper. There were and still are doubts as to whether text-based
trait insights are reliable. The corresponding author (Gain) was a
major contributor to the study of word-embedding techniques. The
second author (Koponen) implemented the Python program to
obtain the IBM Personality Insights service's API-based results and
participated in practical data analytics, graphical elements, and the
correlation matrix. All authors co-wrote the final manuscript.
VII. This research regarding automation, data origins, and processing in
the General Data Protection Regulation sanction-based articles is
essential for compliance and needs to be part of proofs of concept in
cognitive services. The idea of the research is to supplement the
assessment aspect concerning cognitively computed insights. The
corresponding author (Gain) was a major contributor to the study of
indicative semantic roles. Both authors Gain and Hotti analysed the
IBM Watson Natural Language Understanding Text Analysis cognitive
service results. Both authors co-wrote the final manuscript.
VIII. The research idea was to review and experiment with AutoML
frameworks and provide insight-driven data pipelines where data is

ingested, unified, mastered, and enriched as a ground for reports,
dashboards, and applications based on the shared knowledge of
authors Gain and Hotti.
IX. The idea of applying frameworks for cognitive services in IIoT was the
author’s, and she was the only contributor to this publication. The
author performed all the data analysis, interpretation of the results,
and writing.
These publications are referred to as Papers I–IX throughout this thesis. The
above publications have been included at the end of this thesis with their
copyright holders’ permission. The permissions to re-publish are also stated
on the cover sheet of each published paper.

TABLE OF CONTENTS

ABSTRACT ............................................................................................................... 7
ACKNOWLEDGEMENT ......................................................................................... 13
LIST OF ORIGINAL PUBLICATIONS ..................................................................... 15
AUTHOR’S CONTRIBUTION ................................................................................. 17
TABLE OF CONTENTS ........................................................................................... 21
1 INTRODUCTION .............................................................................................. 27
1.1 Research gap and research questions ...........................................................28
1.2 Types of constructions......................................................................................30
1.3 Structure of the dissertation ...........................................................................32

2 CONCEPTUAL CONTEXT ................................................................................. 33


2.1 Ambidexterity.....................................................................................................33
2.2 Cognitive computing .........................................................................................36
2.3 Insights ................................................................................................................45
2.4 Cognitive services ..............................................................................................48

3 SUMMARY OF PAPER-BASED CONSTRUCTIONS .......................................... 57


3.1 Big data analytics for professionals, data milling for laypeople (Paper I) .57
3.2 Exploitation and exploration underpin business, and insights underpin
business analytics (Paper II) ...........................................................................58
3.3 Automation capabilities cognitively challenge work activities (Paper III) .60
3.4 Tones and traits: experiment of text-based extractions with cognitive
services (Paper IV) ............................................................................................64
3.5 The cognitive function and framework of the functional hierarchy (Paper V)
.............................................................................................................................65
3.6 Behavioural interventions from trait insights (Paper VI) .............................70
3.7 Awareness of automation, data origins, and processing stakeholders
through parsing the general data protection regulation sanction-based
articles (Paper VII).............................................................................................71
3.8 Low-code autoML-augmented data pipeline: a review and experiments
(Paper VIII) .........................................................................................................72
3.9 Applying frameworks for cognitive services in IIoT (Paper IX) ...................73

3.10 Research methods as adapted frameworks ...............................................73

4 CONCLUDING REMARKS ................................................................................ 77


4.1 Power of the research process phases ..........................................................81
4.2 Answers to the research questions ................................................................84
4.3 Practical implications ........................................................................................89
4.4 Future research issues .....................................................................................92

5 BIBLIOGRAPHY................................................................................................ 95
6 PAPERS ........................................................................................................... 105

LIST OF TABLES
Table 1. Mapping between business analytics principles and personality
insights (adapted from Table 2, Paper II: table content errors
have been fixed). ...................................................................... 59
Table 2. Cognitive services versus capabilities of cognitive services. 62
Table 3. Constructions and used cognitive services: PI = IBM Personality
Insights, TA = IBM Tone Analyzer, NLU = IBM Natural Language
Understanding, AL = Alchemy Language, RR = IBM Retrieve and
Rank, TAO = IBM Tradeoff Analytics, D = IBM Discovery, VR = IBM
Visual Recognition, and SR = Microsoft Speaker Recognition.77
Table 4. Context-related constructions: A = ambidexterity, CC= cognitive
computing, I = insights. ........................................................... 78
Table 5. Constructions and the main aim of utilisation as well as
research questions: P = paper, A = abstraction, E = experiment,
F = framework, M = mapping, Q1= cognitive capabilities, Q2=
human cognitive amplification, Q3= impact assessment. .... 84

LIST OF FIGURES

Figure 1. The proofs-of-concept cycles from data to insight. .............. 29


Figure 2. The transformation from data to information, knowledge, and
wisdom through cognition. ..................................................... 37
Figure 3. The architecture of cognitive computing, adapted from Kaufman
et al. (2015). .............................................................................. 43
Figure 4. The example of Peirce’s sign (adapted from Hiltunen, 2010). I
came out from the cottage to the terrace and looked left into
the woods. Then, my attention (i.e., vision) attaches to the view
presented in the triangle (I became frightened). .................. 46
Figure 5. The stump in the morning light. I calmed down as it did not
seem to be moving. I decided to go closer and found a stump.
................................................................................................... 47
Figure 6. Outcomes of the cognitive services. ...................................... 54
Figure 7. IBM Personality Insights: big-five personality dimensions and 30
facets (adapted from IBM, 2021a). ......................................... 54
Figure 8. IBM Personality Insights: 12 needs and five values (adapted
from IBM, 2021a). .................................................................... 55
Figure 9. IBM Personality Insights: 42 consumption preferences (adapted
from IBM, 2021a). .................................................................... 55
Figure 10. Utilisation mindset of cognitive computing (Paper III, Figure 4).
................................................................................................... 64
Figure 11. Framework for the functional hierarchy of cognitive functions
(adapted from Paper V). .......................................................... 66
Figure 12. Hierarchy of cognitive functions (adapted from Paper V, Figure
4). ............................................................................................... 68
Figure 13. Research process phases and research questions framed for
construction mappings............................................................ 82

LIST OF ABBREVIATIONS

3D Three dimensional
AI Artificial intelligence
API Application programming interface
AutoML Automated machine learning
BI Business intelligence
DIKW Data, information, knowledge, wisdom model
Glove Global vectors for word representation
IE Information extraction
IoT Internet of Things
IPIP International Personality Item Pool
LDA Latent Dirichlet allocation
NED Named entity disambiguation
NEL Named entity linking
NEN Named entity normalisation
NER Named entity recognition
NERD Named entity recognition and disambiguation
NLP Natural language processing
NDC Not defined cognitive
PLSI Probabilistic latent semantic indexing
Q/A Question answering
TF-IDF Term frequency inverse document frequency
UIMA Unstructured information management architecture
WDA Watson Discovery Advisor

1 INTRODUCTION

The evaluation of data and information is referred to as business intelligence


(BI), and it partly aims to judge performance. In the era of digitalisation, data
discovery-oriented platforms are the mainstream in BI. Thus, there is an
urgent need for a new generation of computational theories and tools to
assist business users in extracting useful information and insights from
structured and unstructured data. Understanding what tools may support
different parts of a data pipeline is essential for creating insights from data
and identifying common drawbacks of the employed tools: they often lack
adequate support for laypeople (i.e., non-experts in data systems) such as
selecting the correct parameters for setting up the tool. This disadvantage
relates to several literacy-related fields including data literacy, machine-
learning (ML) data science literacy, and general computing literacy.
One way to bring the insights closer to the experiments in business is to
use the same concepts. Further, ideas around organisational ambidexterity
(i.e., exploitation and exploration) have been adapted within automation
when organisations confront the problems of inadequate resources.
Cognitive services are integrated into smart things, and they bring
knowledge and a learning environment to BI in order to increase human
cognition. Smart things are things that participate in the internet of
everything (Langley et al., 2021). They are constructed from different
combinations of signals and of hardware- and software-based smart
functionality that can connect people, systems, processes, and assets, and
that supports monitoring, control, optimisation, and autonomy. Notably, cognitive services can be
functions in a discreet manner so that the end user is not even aware of
using the core capabilities of cognitive computing (Kelly, 2015b). For
example, the Talkspace online therapy service was constructed by adapting
the IBM Personality Insights Service as it manifests personality traits from

textual data (Talkspace, 2020; IBM, 2021a). Similarly, Grammarly is a
browser extension that manifests tones from textual data (Grammarly,
2020).
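The text-to-trait idea behind such services can be illustrated with a minimal sketch. In GloVe-style word embeddings, each word is a vector, and semantic closeness is the cosine of the angle between vectors; a text's affinity to a trait can then be scored by comparing its words with trait-typical words. The three-dimensional vectors and word choices below are invented for illustration only (real GloVe vectors have 50 to 300 dimensions):

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Toy "embeddings" (invented values, purely illustrative).
embeddings = {
    "outgoing": [0.9, 0.1, 0.3],
    "sociable": [0.8, 0.2, 0.4],
    "quiet":    [-0.7, 0.9, 0.1],
}

# Words used in a text that are close to trait-typical words push the
# corresponding trait score up; distant words do not.
print(cosine_similarity(embeddings["outgoing"], embeddings["sociable"]))  # high
print(cosine_similarity(embeddings["outgoing"], embeddings["quiet"]))     # low
```

Actual services aggregate many such comparisons over a whole text and calibrate the result against survey data, which is why, as noted in the abstract, their exact trait calculations cannot be explained from the outside.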

1.1 Research gap and research questions

In business practices, there is a need for new tools to gain insights from data.
However, there is an ongoing research gap in assessing what benefits the
new tools (e.g., cognitive services) offer, what they can automate, and how
they affect human behaviour. Deployments of new technologies are being
delayed due to a lack of understanding of process phases, from hype1 or
unknown items to their impact assessments.
Figure 1 presents the fundamental entities of this thesis (i.e., data,
technology, and human cognition). It illustrates proofs-of-concept cycles for
discovering the insights gained in this thesis. Correspondences between
human cognition and cognitive services (i.e., transparency) are strongly
related to the amplification of human cognition and its assessment, since
without correspondences there is no basis for deciding what to amplify or
reject. In other words, human
cognition needs to be understood to build objective correspondences with
data. Arguments need to be presented that construct transparency which
can be verified. Cognitively computed insights are manifested either by
cognitive services or automated ML frameworks.

1
Hype refers to a situation in which something is brought out to attract
everyone's interest (Cambridge University Press, 2021).

Figure 1. The proofs-of-concept cycles from data to insight.

Organisations must be exploratory and exploitative, which means that
proofs of concept must continuously be conducted for new ideas or hype
issues. Some ideas or issues ripen slowly. Therefore, deployments of new
technologies are delayed because there is no natural adoption framework
for organisations, and their impact is not addressed by proofs of
concept. This dissertation aims to fill the gap with answers to the following
research questions: What are the correspondences between human
cognition and cognitive services?; How to amplify human cognition within
the cognitively computed insights?; and How can the impacts of the
cognitively computed data for organisations be assessed?

1.2 Types of constructions

The International Organization for Standardization's Online Browsing
Platform (ISO OBP), the Software and Systems Engineering Vocabulary
(SEVOCAB), and the Unified Compliance Framework Dictionary (UCF) are
three vocabularies that were used to obtain evidence for definitions. The
vocabularies were selected based on meaningful content, that is, they cover
terms used in software system domains and authority documents (e.g., laws
and standards). The IEEE Computer Society and ISO/IEC JTC 1/SC7 constitute
the SEVOCAB authors, and they collect terms concerning software system
domains. Partly, SEVOCAB contains terms of the ISO Online Browsing
Platform (ISO OBP), mainly a collection of standardised terms. The UCF
collects terms concerning authority documents2, and the terms are usually
general without domain specificity.
The term "construction" is associated with implementation (SEVOCAB), a
"process of writing, assembling, or generating assets" (ISO OBP, SEVOCAB),
a structure or "complex entity" that is "made of many parts" (UCF). Further,
an implementation requirement is defined as a “construction of a system or
system component” (SEVOCAB), and a construction element is a "constituent
of a construction entity with a characteristic function, form, or position" (ISO
OBP).
The term "construction" is meant to be a typed entity, the types of which
are abstraction, experiment, framework, and mapping. The term
"abstraction" is defined as a "[p]reoccupation with something to the
exclusion of all else" (UCF) and a "view of an object that focuses on the
information relevant to a particular purpose and ignores the remainder of
the information" (ISO OBP, SEVOCAB). The term "experiment" refers to
trying "something new, as in order to gain experience" (UCF), or it is a
"purposive investigation of a system through selective adjustment of

2 Authority documents refer to the following document types: statutes,
regulations, directives, principles, standards, guidelines, best practices,
policies, and procedures (UCF)
controllable conditions and allocation of resources" (ISO OBP). The term
"framework" is defined, for example, as a "real or conceptual structure used
as a support or guide for building something" (UCF), a "particular set of
beliefs or ideas referred to in order to describe a scenario or solve a
problem" (ISO OBP), or a "reusable design (models or code) that can be
refined (specialised) and extended to provide some portion of the overall
functionality of many applications" (SEVOCAB). The term "mapping" is an
"assigned correspondence between two things represented as a set of
ordered pairs" (SEVOCAB), "[a]ny mathematical condition relating each
argument (input value) to the corresponding output value" (UCF), or a "set
of values having defined correspondence with the quantities or values of
another set" (ISO OBP).
The main results of this dissertation are 20 constructions (Chapter 3):
seven experiments, four abstractions, three frameworks, and six mappings.
The main meanings of the typed constructions are vocabulary based. Their
meanings as used in this dissertation are as follows: abstraction refers to
relevant information used to highlight a specific purpose; experiment is a
purposive investigation in order to gain experience and evidence;
framework is a set of reusable elements used to guide solution-focused
development; and mapping is an assigned correspondence between two
entities in order to explain differences and similarities.
The constructions were established by adapting several frameworks
(Section 3.10). When the term “framework” is used to illustrate research
materials or methods, its meaning covers content analyses tools and
techniques, review guidelines, reliability and validity instructions, as well as
contextual ground materials concerning automated ML (e.g., Pycaret), brain
models (3D Brain, the layered reference model of the brain), business model
canvases (e.g., the value proposition canvas), McKinsey’s automation
capabilities, personality traits and types (Global vectors for word
representation, International Personality Item Pool), and value propositions
(e.g., Bøe-Lillegraven’s ambidexterity value chains).

1.3 Structure of the dissertation

This thesis is based on nine scientific papers. Paper I illustrates the
examination of information nuggets for better competitive advantage:
possibilities for deeper understanding can lead to reactions. Paper II clarifies
the meaning of objective insights such as those cognitively computed. Paper
III presents the capabilities of cognitive services. Paper IV clarifies utilisations
of the cognitive services with and without interventions. Paper V explicates
the correspondences between human cognition and cognitive services.
Paper VI addresses word embeddings concerning textual inputs. Paper VII
proposes a method for assessing whether cognitive services are useful in
manifesting indicative semantic roles. Paper VIII presents a pipeline from
raw data to insights, possibilities of low-code autoML cognitive supportive
insights, and deeper understanding of meaningful fields. Paper IX proposes
value additions in the form of questions and answers.
The thesis contains four chapters and nine publications, and it is
organised as follows. Chapter 1 introduces the research scope and research
questions. Chapter 2 presents the conceptual context and cognitive services,
the insights of which were researched. Chapter 3 summarises the
contributions of the research papers (Papers I–IX) that make up this thesis.
Finally, Chapter 4 concludes the thesis by presenting validity issues,
implications, and issues for future research.

2 CONCEPTUAL CONTEXT

As the thesis frames cognitively computed insights, the conceptual context
of the thesis and research issues are related to ambidexterity (Section 2.1),
cognitive computing (Section 2.2), insights in general (Section 2.3), as well as
insights computed by cognitive services (Section 2.4).

2.1 Ambidexterity

Ambidexterity refers to organisational capabilities that ensure cash flow and
investment in product development. The term “ambidexterity” has its origins
in the 1650s and originally referred to the ability to use both hands with
equal ease. It was also used in Medieval Latin with the meaning of double
dealing (Dictionary.com, 2010). Later, in 1976, Duncan (1976) introduced
organisational ambidexterity in business; after that, March (1991) adopted
this concept for exploitation and exploration; it illustrates tension in the
business model (Raisch and Birkinshaw, 2008). The importance of
ambidexterity is highlighted in the context of insight-related issues such as
strategic management, innovation, technology management, organisational
learning and adaptation, organisation theory, and organisational behaviour
(Simsek, 2009).
Ambidexterity of different kinds is used at various levels: structural at the
corporate level, contextual at the business-unit level, and sequential at the
project level (Chen, 2017). Sequential ambidexterity realigns organisational
structure, shifting the focus temporally between exploitation and
exploration in response to changing environmental conditions, strategies, or
project-level requirements (Chen, 2017; O’Reilly and Tushman, 2013). In structural
ambidexterity, exploitation and exploration are organised in separate units
and coordinated by top managers (O’Reilly and Tushman, 2004; Tushman
and O’Reilly, 1996) to use different business-unit strategies, structures, and
processes (Chen, 2017). This is compared with contextual ambidexterity, in
which the created organisational environment allows employees to freely
choose between exploration and exploitation; for example, at Google,
engineers can use 20% of their time to explore their selected research
projects (Chen, 2017).
There is a positive relationship between ambidexterity and firm growth,
firm performance, and business-unit performance. Ambidexterity can be
explained as a transition of organisational units between two processes—
exploitation and exploration—and by a company’s desire to benefit from the
complementarities of the two processes (Zimmermann et al., 2015). As Katila
and Ahuja (2002) explained, the exploration of new capabilities (scope)
elaborates the knowledge base of the organisation as well as the existing
capabilities (depth), which is often needed in exploration.
Organisational duality or tension between exploitation and exploration in
organisational learning has been studied through various aspects, for
example, Baum et al. (2000) state that “exploitation refers to learning
gained via the local search, experiential refinement, and selection and reuse
of existing routines. Exploration refers to learning gained through the
processes of concerted variation, planned experimentation, and play.”
Incremental innovations are minor adaptations carried out to meet existing
customer needs. Radical innovations are fundamental changes made to
satisfy emergent customer needs (Raisch and Birkinshaw, 2008).
Incremental innovation can operate in an ambidextrous organisation, and
this requires “the fine-grained strategic schemata and the governance
principles of the explicit knowledge are the underlying levers that set the
dimensions for incremental innovations” (Laukkanen, 2012).
Ambidexterity involves new technologies in strategic management.
Burgelman’s internal ecology strategy model initiatives have an
organisational scope and thus increase current knowledge compared to new
initiatives that are not part of the scope and therefore require learning
(Burgelman, 1988; Raisch and Birkinshaw, 2008). Further, the ambidexterity
of ecosystems especially emphasises solutions which search for a balance
between exploitation and exploration in the firm ecosystem, for example in
Kauppila (2010). The creation of an ambidextrous organisation utilises both
exploration and exploitation partnerships. Wan et al.’s (2017) research
aimed to “identify the essential tensions regarding platform strategies” and
further to “analyse how to balance them within platform ecosystems.”
Exploration- and exploitation-based learning, for example planned
experimentation and knowledge sharing, are essential to increase
organisational capabilities and innovations such as value-proposition-based
objective insights. Firms strategically use a wide range of technologies for
exploration and exploitation purposes (Bresciani et al., 2018). Regarding
cognitive computing, John Kelly (2015b) propounded that “there is not an
industry or discipline that this technology won't completely transform over
the next decade.” Furthermore, Lucas and Goh (2009) characterised
disruptive technology as follows: “the most important observation is that
management has to recognise the threats and opportunities of new
information and communications technologies and marshal capabilities for
change.” Moreover, for organisational exploitation and exploration, the
technical activities need to be linked to business needs. In other words, they
need to be strategically integrated into the performance management of
organisations, for example in measures and strategic goals (Simsek et al.,
2009; Burgelman, 1988).

2.2 Cognitive computing

Cognition refers to “mental actions or processes of acquiring knowledge and
understanding through thought, experience, and the senses” (ISO OBP).
Cognition “enables the new classes of products and services to sense,
reason and learn about their users and the world around them” because
“where code and data go, cognition can follow, cognition transforms how a
company operates” (Kelly, 2015a). Data are transformed into cognition
through the paths of the data, information, knowledge, wisdom (DIKW)
model (Wang, 2009; Baškarada et al., 2013). Data are transformed into
meaningful information through cognitive processing (Baškarada et al.,
2013). Knowledge consists of understood, organised, absorbed, and
memorised information, as well as skills and experiences acquired by doing
something; it is accumulated learning. Wisdom is accumulated knowledge.
From a wide perspective, wisdom is a form of
cognitive understanding that evolves with age as skills and knowledge
accumulate (Takahashi and Overton, 2005). Characteristic of wisdom is an
ongoing balancing between knowing and doubting, concordant with the
balance theory of wisdom (Takahashi and Overton, 2005). Wang (2009)
defines cognitive information as all internal embodiments, for example
experience, knowledge, and skills, that is between “data (sensational inputs)”
and “action (behavioural outputs).”
Figure 2 represents the transformation from data to information,
knowledge, and wisdom through cognition. In
Figure 2, the smaller arrows represent the ongoing brain processes, and the
bigger loop with the arrows represents the ongoing interactive process from
external perceptions.

Figure 2. The transformation from data to information, knowledge, and
wisdom through cognition.

Evolution towards cognitive systems: question answering systems.
Question answering (Q/A) systems are composed of information retrieval,
natural language processing, information extraction, knowledge
representation, and reasoning (Maybury, 2004). Early Q/A systems used
databases to answer users’ questions. The questions were mapped to
computable database queries; the systems were complex and required
expert, hands-on direction to maintain and to solve problems. The era of
the World Wide Web has provided a new source and approach for Q/A
systems.
In these open-domain question answering systems, an answer must be
found and extracted through text retrieval. Pasca (2007) introduced a new
model for answer retrieval which embodies the de facto paradigm for Q/A
as the following (Webber and Webb, 2010, 630–654): retrieve potentially
relevant documents; extract a potential answer; and return a top answer(s).
The era of cognitive computing has brought Q/A systems to a new level; it
has reformed the question-answering model and extended each analysis
phase. The Watson Jeopardy Q/A system is a predecessor of the Watson
Discovery Advisor (WDA). Hence, Watson Jeopardy served as a starting point
in the DeepQA project (Beller et al., 2016; Ferrucci et al., 2010). Ferrucci et
al. (2010) present WDA system principles as content acquisition, question
and topic analysis, question decomposition, hypothesis generation,
hypothesis and evidence scoring, synthesis, final confidence merging, and
ranking analysis. The content of each phase of the pipelines is provided in
Ferrucci et al. (2010).
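The de facto retrieve–extract–return paradigm described above can be illustrated with a minimal sketch. The tiny corpus, the whitespace tokenisation, and the keyword-overlap scoring below are illustrative assumptions, not the DeepQA or WDA implementation:

```python
# Minimal sketch of the de facto Q/A paradigm: retrieve potentially
# relevant documents, extract candidate answers, and return the top
# answer(s). Corpus and scoring are illustrative assumptions only.

CORPUS = {
    "doc1": "Helsinki is the capital of Finland. It lies on the Baltic coast.",
    "doc2": "Finland joined the European Union in 1995.",
    "doc3": "The sauna is a traditional Finnish bath.",
}

def tokenize(text):
    """Lower-case the text and strip simple punctuation."""
    return {t.strip(".,!?").lower() for t in text.split()}

def retrieve(question, corpus):
    """Step 1: rank documents by keyword overlap with the question."""
    q = tokenize(question)
    scored = sorted(((len(q & tokenize(text)), doc_id)
                     for doc_id, text in corpus.items()), reverse=True)
    return [doc_id for score, doc_id in scored if score > 0]

def extract(doc_text):
    """Step 2: treat each sentence as a candidate answer."""
    return [s.strip() for s in doc_text.split(".") if s.strip()]

def answer(question, corpus, top_n=1):
    """Step 3: score candidates and return the top answer(s)."""
    q = tokenize(question)
    candidates = sorted(
        ((len(q & tokenize(sent)), sent)
         for doc_id in retrieve(question, corpus)
         for sent in extract(corpus[doc_id])),
        reverse=True)
    return [sent for _, sent in candidates[:top_n]]

print(answer("What is the capital of Finland?", CORPUS))
# → ['Helsinki is the capital of Finland']
```

Cognitive Q/A systems such as WDA replace each of these heuristic steps with learned components and confidence scoring, but the overall pipeline shape is the same.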
Evolution towards cognitive systems: transform unstructured data into
understandable form. The ability to transform unstructured data into a
computable and understandable form was a major step because most
business information consists of unstructured data (Blumberg and Atre,
2003). Also, data expansion is an ongoing process; for example, the Internet
of Things (IoT) creates more data. A significant portion of unstructured data
consists of textual formats such as email bodies, documents, web pages,
and social media data. Academic research in information retrieval and
computational models such as a vector space, probabilistic retrieval, and
Boolean retrieval models laid the groundwork for the progress of search
engines (Salton, 1988). Part of the development was the result of
computational linguistics techniques such as statistical natural language
processing (NLP) for lexical acquisition, word sense disambiguation (WSD),
probabilistic context-free grammars, and part-of-speech tagging (Manning
and Schütze, 1999).
Evolution towards cognitive systems: information extraction. In the era of
big data, text-mining techniques have been enriched in several domains,
especially in information extraction. Information extraction techniques like
sentiment analysis are used to identify the polarity of a text, providing a
positive or negative tone. Further, sentiment analysis categorisation helps
to scale the sentiment in a text. Sentiment classification and categorisation
are problematic because of subjectivity (Pang and Lee, 2008). Also, detecting
sarcasm and irony in a text is challenging (Peng et al., 2015) as is determining
the context of a text where a negative or positive sentimental tone is
detected. The information extraction (IE) technique called named entity
recognition (NER), also known as entity extraction, includes processes to
specify names, places, dates, and organisations. It identifies and classifies the
entities in the text. There are different ways to exploit NER results: for
example, they can be indexed or linked; many IE relations are associations
between named entities; sentiments can be attributed to them; and they
can be used for question answering when the answers are named entities
(Manning, 2017). The IoT devices and sensors serve as
the machine’s senses of the environment; they offer opportunities to
leverage value from the unstructured data in the IoT when combining
services such as IBM Watson with, for example, the cognitive IoT in fitness
and well-being (Sheth, 2016).
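The polarity detection described above can be illustrated with a minimal lexicon-based sketch; the tiny word lists and the counting rule are illustrative assumptions, not a production sentiment model:

```python
# Minimal lexicon-based sentiment sketch: count positive and negative
# cue words and report the polarity of the text. The tiny lexicons and
# the scoring rule are illustrative assumptions only.

POSITIVE = {"good", "great", "excellent", "love", "happy"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "sad"}

def polarity(text):
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(polarity("I love this great service!"))   # → positive
print(polarity("The support was terrible."))    # → negative
```

As the text notes, such surface counts cannot capture sarcasm, irony, or context, which is why sentiment classification remains problematic.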
Cognitive computing differs from traditionally programmed solutions.
Traditionally programmed solutions produce pre-determined outcomes,
whereas the outcomes of cognitive computing are probabilistic
(Marchevsky et al., 2017); the two also differ in their technological
capabilities. For example, a cognitive system can learn, recognise patterns,
and process natural language and images (Kaltenrieder et al., 2015). As part
of artificial intelligence (AI),
cognitive computing composes a set of services that mimic human brain
processes (Chen et al., 2016) such as recognising the speaker, transforming
speech to text, and abstracting sentiment (for example, positive, negative,
or neutral) from a text such as an SMS message (MS, 2017b; MS, 2020a; MS,
2020b). Hoffenberg (2016) distinguishes cognitive computing from AI: AI
systems offer solutions, whereas a cognitive system provides information
about choices (i.e., it helps the human decide).
Cognitive computing aggregates collective and computational intelligence;
furthermore, there is no distinction between disciplines (Kaltenrieder et al.,
2015).
Grounds of cognitive computing. Cognitive computing can be seen as an
umbrella term for diverse research fields, models, processes, and
algorithms such as machine learning (ML), neural networks, semantic and
natural language processing, information retrieval, knowledge
representation, and reasoning (Kaltenrieder et al., 2015; Gliozzo et al., 2013).

Cognitive computing combines extensive academic research in algorithms
and device architecture such as deep learning and platforms inspired by
natural neural networks (Nahmias et al., 2013). These services are run on
cloud platforms that combine neural networks and sophisticated algorithms
such as ML, deep learning, and natural language processing to reach
solutions with and without human intervention (ElBedwehy et al., 2014;
Williamson, 2017). Cognitive solution development is an iterative process
that concerns model development, analysis, and testing, where ground truth
data correspond to model accuracy (Kaufman et al., 2015). The general
foundation of cognitive computing consists of the model, hypothesis
generation, and continuous learning. Cognitive systems’ typical architecture
is presented in Figure 3 as follows (Kaufman et al., 2015):
• Presentation and visualisation. Presentation and visualisation services
present data that support a hypothesis. They also support a request
when the system requires more information to improve the confidence
of the hypotheses. For this purpose, the three main types of services are
narrative solutions, visualisation services (graphics, images, gestures, or
animation), and reporting services such as structured outputs.
• Corpora and other data sources. The knowledge base for a cognitive
system is the corpus (plural corpora), which needs to be defined for
building a machine-readable model of a specific domain. This system's
knowledge base is used for linguistic analysis, discovering relationships
or patterns, answering questions, and delivering insights. Therefore,
corpora data sources play an important role in the system
implementation, and they can be updated continuously or periodically.
Other data sources the system could need are ontologies, taxonomies,
catalogues, structured databases of the specific subject, and acquired
information such as images, videos, sensor data, and language-related
data (voice and text).
• Processing services. Processing services transform external data
sources’ natural language text, video, images, audio files, and sensor data
into a machine-learnable format.

• Analytic services. Analytic services collect techniques for presenting
characteristics or relationships in the dataset, for example regression
analysis. Standard analytic components can be deployed such as
descriptive, predictive, and prescriptive tasks performed by statistical
software packages. Advanced analytics includes statistics, data mining,
and ML.
• Feature extraction and deep learning. Feature extraction and deep
learning are used to collect techniques needed to transform data into a
form that captures essential properties.
• Natural language processing. Natural language processing (NLP) is used
to process unstructured text. This group of techniques aims to extract
meaning from text. The techniques include language identification,
tokenisation, lexical analysis, syntax and syntactic analysis, resolving of
structural ambiguity, disambiguation of word sense, and semantics.
• Data access, acquisition, metadata, and management services. For
better answers and recommendations for decisions, the data sources of
the cognitive systems may need to be updated to correspond to the
latest domain-specific information. When supplemental information
needs to be added to the data sources, it must be identified, acquired,
and further transformed to support ML. Data access performs the
required analysis by identifying the relevant data.
• Internal data sources. Internal data sources are the structured and
unstructured data of organisations.
• Infrastructure/deployment modalities. Infrastructure and deployment
modalities consist of the networking, hardware, and storage base for
cognitive applications. The major considerations for infrastructure
approaches are distributed data management and parallelism.
Distributed data management consists of workload and external data
resources. Management needs to provide scalability and flexibility for
managing large quantities of data. Parallelism can contribute to the cycle
of hypothesis generation and scoring to process multiple hypotheses
and scorings simultaneously.

• Hypothesis generation and scoring. The hypothesis requires evidence or
a knowledge-based explanation of a causal relationship. The cognitive
system searches for evidence such as experiences and data as well as
relationships between data elements that can support or refute a
hypothesis. The system can create multiple hypotheses that underpin
the data in the corpus. For example, the hypothesis can be constructed
based on a user question when the corpus has trained question/answer
pairs. The hypothesis is a construct based on given data, where system
search patterns are based on assumptions defined in the system. The
hypothesis scoring assigns a confidence level for the hypothesis.
• Machine learning. ML algorithms look for patterns. Pattern similarity
comparison of elements such as structure, values, or proximity
(closeness) of data can interpret the pattern compared to known
patterns. For basic learning, the approach can be used for pattern
detection. The choice depends on available data and the nature of the
problem to be solved. Therefore, the choice of ML algorithms for the
cognitive application follows the same principle. For example, supervised
learning is a good candidate if the source data and associations between
data elements for the problem exist and if the data patterns can be
identified from the data, allowing them to be further exploited. Other
example approaches are reinforcement learning (the system takes action
and learns by trial and error) and unsupervised learning (the system
detects patterns and knowledge about their relationships and associations).

Figure 3. The architecture of cognitive computing, adapted from Kaufman
et al. (2015).
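The machine learning component above interprets new data by pattern similarity, that is, by proximity (closeness) to known patterns. A minimal supervised sketch of this idea, with hypothetical two-dimensional toy data, might look as follows:

```python
# Minimal supervised-learning sketch: classify a new point by its
# proximity (closeness) to the centroid of each labelled pattern.
# The labels and 2-D toy data are illustrative assumptions only.
from math import dist

TRAINING = {
    "low_risk":  [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9)],
    "high_risk": [(4.0, 4.2), (3.8, 4.5), (4.3, 3.9)],
}

def centroid(points):
    """Average each labelled pattern into a single representative point."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def classify(point, training):
    """Assign the label whose centroid is closest to the new point."""
    centroids = {label: centroid(pts) for label, pts in training.items()}
    return min(centroids, key=lambda label: dist(point, centroids[label]))

print(classify((1.0, 1.0), TRAINING))  # → low_risk
print(classify((4.1, 4.0), TRAINING))  # → high_risk
```

This mirrors the architecture's description: associations between data elements are learned from labelled source data and then exploited to interpret new observations.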

Cognitive computing application interfaces. The architecture of cognitive
systems provides cognitive service application programming interfaces
(APIs) that can be
used as building blocks for the solutions. Cognitive computing companies
provide these services as APIs on their developer cloud platforms. For
example, Microsoft gathers these service features into groups such as
vision, speech, language, and decision (MS, 2021), and IBM provides AI/ML
services such as IBM Watson Text-to-Speech, IBM Watson Speech-to-Text,
IBM Watson Discovery, IBM Watson Knowledge Studio, and IBM Watson
Natural Language Understanding (IBM, 2021c). These services are iterative
and interactive (PAT Research, 2021).
Measurable responses help determine the acceptable level of response.
An artificial neural network does not provide user-interpretable reasons for
output (i.e., the production rules; Louridas and Ebert, 2016), for example
object recognition in which the system infers the rules for identification
during the training phase (i.e., states are embedded in the neural network)
and through object recognition provides the user with the result (i.e.,
classified the images; Abadi et al., 2015).
Information extraction and use of models. Useful information is extracted
from text-based data using models of different kinds. The big
five model captures the personality traits that underpin theories of
personality (De Raad, 2000). However, the model lacks a commonly accepted
solution as it misses the best way to form concepts of and measurements
for each domain of the big five. The ability to replicate the big five in
different cultures and languages and the optimal way to explore subfactors
of the big five are also missing. This raises doubts as to whether significant factors
beyond the big five exist and whether the big five should be merged into
super factors (Johnson, 2017). Text models address, for example, the
categorisation problem, in other words, classifying new documents,
sentences, and words. Such models incorporate techniques from knowledge
areas such as NLP, ML, and statistics. Word-embedding models aim to find
semantically similar words to capture syntactic and semantic regularities
from the text. For example, the Google open-source tool Word2Vec uses
neural networks (other approaches use matrix factorisation) to learn the
vector representation of a word to predict the other words in a sentence. This
vector representation captures the structure of the text and is used for text
classification (Merret, 2015). Term frequency-inverse document frequency
(TF-IDF) calculates the frequency of a term in a document and its importance
relative to the corpus (Rajaraman and Ullman, 2011). Topic modelling uses
algorithms to find the main themes in large arrays of unstructured
collections of documents. Probabilistic latent semantic indexing (PLSI) is a
statistical technique that uses probability to model co-occurrence data
(Hofman, 2013). The ML technique latent Dirichlet allocation (LDA) is a
developed version of PLSI (Blei, 2012); it identifies clusters of entities that
are related (Earley, 2015).
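The TF-IDF weighting described above, combined with the vector representation of text, can be sketched briefly; the three-document corpus and the whitespace tokenisation are illustrative assumptions:

```python
# Minimal TF-IDF sketch: weight each term by its frequency in a document
# and its rarity across the corpus, then compare documents by cosine
# similarity of the resulting vectors. The corpus is illustrative only.
from math import log, sqrt

DOCS = [
    "cats chase mice",
    "dogs chase cats",
    "mice eat cheese",
]

def tfidf_vectors(docs):
    """Return the vocabulary and one TF-IDF vector per document."""
    tokenised = [d.split() for d in docs]
    vocab = sorted({t for doc in tokenised for t in doc})
    n = len(docs)
    df = {t: sum(t in doc for doc in tokenised) for t in vocab}  # document frequency
    vectors = [[doc.count(t) / len(doc) * log(n / df[t]) for t in vocab]
               for doc in tokenised]
    return vocab, vectors

def cosine(u, v):
    """Cosine similarity between two vectors (0 when either is all zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

vocab, vecs = tfidf_vectors(DOCS)
# Documents sharing terms score higher than documents with no overlap.
print(cosine(vecs[0], vecs[1]) > cosine(vecs[1], vecs[2]))  # → True
```

The same vector representation underlies the text-classification uses mentioned above: once documents are points in a term space, similarity and clustering become geometric operations.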

2.3 Insights

A standard definition of insight is profound and unique knowledge about
an entity where knowledge is an “outcome of the assimilation of information
through learning,” and an entity is “anything perceivable or conceivable” (ISO
OBP). However, the term “insight” is not used in statements of authority
documents such as directives and regulations (UCF). An insight is connected
with something. In other words, insight is “an instance of apprehending the
true nature of a thing, through” accurate, deep, clear, and intuitive
understanding (Dictionary.com, 2021; Cambridge University Press, 2021;
Lexico.com, 2018). Synonyms for insight such as perception, apprehension,
intuition, and understanding (Dictionary.com, 2021) are based on a
subjective perspective. From the perspective of an individual, insight is
described as “an understanding about the motivational forces behind one's
actions, thoughts, or behaviour; self-knowledge” (Dictionary.com, 2021) or
“especially an understanding about the motives and reasons behind one's
actions” (Dictionary.com, 2021; Random House Value Publishing, 1996). In
particular, the meaning of insight interconnects terms such as motives,
behaviour, self-knowledge, and action.
When we look at the workflow from a sign to an insight (in other words,
how the thing matures as an insight), at first we receive (i.e., sense) a signal
(i.e., a sign). The signal does not have absolute value unless it is attached to
a meaningful context (De Saussure, 1983; Hiltunen, 2010; Baškarada and
Koronios, 2013). In semiotics, the sign is divided into an object,
representamen, and interpretant (Peirce, 1868; Hiltunen, 2010), where
Peirce's sign (i.e., object) is objective, and the interpretation (i.e.,
interpretant) is subjective since it depends on the receiver of the sign. The
representamen is objective and subjective, for example the words for
objects in different languages (Hiltunen, 2010).

Figure 4. The example of Peirce’s sign (adapted from Hiltunen, 2010). I
came out from the cottage to the terrace and looked left into the woods.
Then, my attention (i.e., vision) attaches to the view presented in the
triangle (I became frightened).

An individual’s abilities influence interpretation. “The interpretant is an
individual’s comprehension of and reaction to, the sign-object
association” (Baškarada and Koronios, 2013). Interpretation of the sign can
be influenced by cognitive bias and self-knowledge. In this example, the
bear’s characteristics, the observation, and regional information, for
example the number of bears, are interconnected with the sign in the
observer's mind. The reception of a signal is influenced by the individuals'
mental filter and experience (Ansoff, 1984; Hiltunen, 2010). Also, signal
quality can affect interpretation, especially if the signal quality is bad, for
example when the signal light of the beacon does not appear sufficiently
far away in the fog. In Figure 4, the observation is based on visual
recognition in the declining evening sun, which highlights colours and
shadows differently (Figure 5).

Figure 5. The stump in the morning light. I calmed down as it did not seem
to be moving. I decided to go closer and found a stump.

Data is defined in semiotics as a “symbol or a set of symbols used to
represent something” (Baškarada and Koronios, 2013). Also, Stephen Few
(2015) states that data is a collection of facts; when a fact is true and useful
(i.e., it must inform, matter, and deserve a response), it is a signal; otherwise,
it is noise. Indeed, Peirce’s representamen of an object, in other words a
thing, is anything that takes our attention, for example an image, a pattern
that repeats in processed big data, or a missing value. Thus, a signal (i.e., a
sign) matures into an insight by first being received (i.e., sensed) as
something that takes our attention. Perception through the senses allows
us to receive and
process data. Perception consists of “how something is regarded,
understood, or interpreted” as well as “intuitive understanding and insight”
(Lexico.com, 2018).
Insight adds value through learning. The prerequisite for learning in the
digital era consists of creating useful information patterns and combining
data sources. Therefore, the capability to recognise and adapt models has
become a key learning task. Data is needed, but it does not need
to be known beforehand; instead, finding sources that fulfil the
requirements is essential. Artificial intelligence is used with data to “extract
meaning, determine better results based on continuous learning, and
enable real‐time decision making” (Iafrate, 2018).
Before the term “big data” was introduced, companies used basic
analytics (mainly by manually examining spreadsheets) to find insights and
trends (SAS, 2021). Organisational insight can be described with the help of
the metamodel of TOGAF (the open group architecture framework) (The
Open Group, 2021). Organisational insights are connected to motivation.
Drivers expose factors (i.e., opportunities or constraints) that motivate
the organisation unit to create goals; a driver (in other words, an insight)
thus addresses a traceable goal. An objective is a time-bounded benchmark (i.e.,
near and midterm measurement points) that defines progress towards a
goal. In addition, the measures set performance criteria for the objective.
Consequently, the course of action realises the goal; it is influenced by
business capability, and it influences the value stream (e.g., value
propositions).
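The motivation chain just described (driver, goal, objective, measures) can be sketched as a small data model. The entity names follow the TOGAF terms in the text; the example values are invented for illustration:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Objective:
    description: str
    deadline: str            # time-bounded benchmark, e.g. "2022-Q4"
    measures: List[str]      # performance criteria for the objective

@dataclass
class Goal:
    description: str
    objectives: List[Objective] = field(default_factory=list)

@dataclass
class Driver:
    factor: str              # opportunity or constraint exposed by the driver
    goals: List[Goal] = field(default_factory=list)

# Invented example values illustrating the traceability chain
driver = Driver(factor="rising demand for data-driven insight")
goal = Goal(description="improve customer retention")
goal.objectives.append(
    Objective("reduce churn by 5%", "2022-Q4", measures=["monthly churn rate"]))
driver.goals.append(goal)
```

The point of the sketch is only the traceability: from a driver one can navigate to the goals it motivates and to the objectives and measures that benchmark them.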

2.4 Cognitive services

Cognitive services are capabilities for manifesting insights (i.e.,
meaningful information) and triggering event-based actions using
sophisticated algorithms and ML techniques. Cognitive services have become more prevalent in
everyday life. They can be embedded in more extensive functionalities and,
in practice, discreetly produce their functions so that the end user is not
even aware of using the core capabilities of cognitive computing (Kelly,
2015a). In this context, this thesis presents cognitive services that use
unstructured data, for example text input. These are focal in this study
because their usability has been assessed in the light of various research
foci, for example insightfulness. The outcomes (Figure 6) and short
descriptions of these cognitive services are presented in the following
sections.
IBM Personality Insights. Personality Insights was deprecated and will be
discontinued as of 1 December 2021. Natural Language Understanding
replaces Personality Insights as part of the analytical workflow (IBM, 2021a).

The service uses an open-vocabulary approach to infer personality traits
from textual input of at least 100 words (the accuracy of the analysis can be
improved by providing more words) using three models: big five, needs, and
values. The service deploys the word-embedding technique GloVe (global
vectors for word representation) to transform the words of the input text
into a vector representation. Next, the ML algorithm calculates percentiles
and raw scores of the personality characteristics from word-vector
representations and consumptions preference scores from the personality
characteristics; it generates the scores for each personality trait and
optionally the consumption preferences. Normalised scores are computed
by “comparing the raw score for the author’s text with results from a sample
population” (IBM, 2021a). The service computes personality characteristics
as follows: five personality dimensions (openness, conscientiousness,
extraversion, agreeableness, and emotional range) and 30 facets (Figure 7),
12 needs and five values (Figure 8), as well as the consumption preferences
(Figure 9). Big five describes “how a person generally engages with the
world”; needs describe what a person “hope[s] to fulfil when [she] consider[s]
a product or service”; and values (or beliefs) “convey what is the most
important to [a person]” (IBM, 2021a). Explanations for both the high and
low values are provided for dimensions and facets of the personality (IBM,
2021a). However, explanations are only provided for the high scores of the
needs (IBM, 2021a) and values (IBM, 2021a). The service evaluates 12 needs
at a high level, and these needs describe the aspects of the product that
likely resonate with the author of the inputted text (IBM, 2021a). Values are
described as motivating factors that influence the author’s decision making
(IBM, 2021a). Consumption preferences are grouped in eight categories. IBM
describes each preference briefly (IBM, 2021a). The calculated score of
consumption preferences indicates the author’s likelihood to prefer the
various products, services, and activities based on text input. The results
include the percentiles and raw scores for the personality traits (i.e.,
dimensions and facets of big five, needs, values, and consumption
preferences3), where percentiles are normalised scores, and raw scores are
based solely on the text (IBM, 2021a).
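The scoring pipeline described above can be illustrated with a toy sketch: word vectors are averaged into a document vector, a linear model stands in for the real ML step, and the raw score is normalised against a sample population. The vectors, weights, and sample scores below are invented; IBM does not publish its model internals:

```python
# Illustrative sketch only: toy "GloVe" vectors and invented trait weights.
TOY_GLOVE = {
    "curious": (0.9, 0.1),
    "novel":   (0.8, 0.2),
    "routine": (0.1, 0.9),
}

def raw_trait_score(words, weights):
    """Average the word vectors of the input text into a document vector,
    then apply a linear trait model (a stand-in for the real ML step)."""
    vecs = [TOY_GLOVE[w] for w in words if w in TOY_GLOVE]
    doc = [sum(dim) / len(vecs) for dim in zip(*vecs)]
    return sum(d * w for d, w in zip(doc, weights))

def percentile(raw, sample_scores):
    """Normalise a raw score by comparing it with results from a sample
    population, as the service's documentation describes."""
    return sum(s <= raw for s in sample_scores) / len(sample_scores)

openness_weights = (1.0, -1.0)   # invented
raw = raw_trait_score(["curious", "novel"], openness_weights)
norm = percentile(raw, sample_scores=[-0.5, 0.0, 0.3, 0.6])
```

The separation between `raw` and `norm` mirrors the service's output: raw scores are based solely on the text, while percentiles situate the author within a comparison population.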
IBM Tone Analyser analyses tones (emotional, social, and language tones)
in the text (IBM, 2021a). Tone scores (ranges are < 50%, 50–75%, and > 75%)
indicate the probability of the tones being in the text. The service produces
outcomes at both the document and the sentence level. There are several
versions of the service (IBM, 2021a); for example, interface version number
2016-05-19 in the request results in all three tones as follows: emotional
tones (anger, disgust, fear, joy, and sadness), language tones (analytical,
confident, and tentative writing styles), and social tones (openness,
conscientiousness, extraversion, agreeableness) in the range of tones of the
big five personality model.
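A minimal sketch of mapping a tone score to a qualitative band, assuming cut-offs of 0.5 and 0.75:

```python
def tone_band(score: float) -> str:
    """Map a tone probability in [0, 1] to a qualitative band.
    Cut-offs of 0.5 and 0.75 are assumed here."""
    if score < 0.5:
        return "unlikely present"
    if score <= 0.75:
        return "moderately likely present"
    return "very likely present"
```

A caller would apply this to each tone score returned for a document or sentence, keeping only the tones whose band suggests they are actually perceived in the text.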
IBM Natural Language Understanding. Cognitive services include the IBM
Watson Natural Language Understanding service (IBM, 2018) and the IBM
Watson Natural Language Understanding Text Analysis service (IBM, 2021b).
The main difference between these versions is in the representations of the
semantic roles. The IBM Watson Natural Language Understanding service
underlines and annotates the semantic roles. In contrast, the IBM Watson
Natural Language Understanding Text Analysis service offers the tabulated
subject-action-object form. It analyses and extracts categories, concepts,
emotions, entities, keywords, metadata, relations, semantic roles, and
sentiment (IBM, 2021a). For example, a test example of the demonstration
included semantic roles extracted from the text—a subject, action, and
object—from the site address of the reference (IBM, 2021a). The previous
version of the IBM Natural Language Understanding service was the IBM
Alchemy Language service (IBM, 2017a), which classifies entities (such as
people, companies, organisations, cities, geographic features) from HTML
text or web-based content, or extracts text from URL web addresses.

3
The results of consumption preferences are obtained in eight categories
comprising 42 preferences as follows (IBM, 2021a): 12 shopping preferences,
10 movie preferences, nine music preferences, five reading and learning
preferences, three health and activity preferences, one entrepreneurship
preference, one environmental concern preference, and one volunteering
preference.

IBM has
also introduced Alchemy APIs (the predecessor of Watson Natural Language
Understanding, Watson Discovery News); it includes named entity extraction
that can classify entities such as people, companies, organisations, cities,
and geographic features from HTML text or web-based content. It uses
“sophisticated statistical algorithms and natural language processing
technology” (Krug et al., 2014). The categories used and returned by the
Natural Language Understanding category feature are divided into five
levels; the meanings of the hierarchies are refined up to Level 5 (IBM, 2017d).
An entity extraction API supports linked data (quotation extraction) using
context-sensitive entity disambiguation techniques to identify the entity
(IBM, 2017a). Developers and domain experts need to specify a custom
model for the organisation-specific entity extraction purposes. They need to
identify a custom set of entity types in the content, for example data mining
for business-specific information unique to the business, industry need,
banks, and cancer research. The models are customised NLP models
consisting of custom annotator components that can be created in a Watson
Knowledge Studio, enabling the user to identify domain-specific mentions
and relations in the unstructured text (IBM, 2021a). NLP uses entity linking,
named entity linking (NEL), named entity disambiguation (NED), named
entity recognition and disambiguation (NERD), and named entity
normalisation (NEN) processes to determine the entities in text. Entity
linking uses a knowledge base to link the entity mentions to the
corresponding node in a knowledge base (Hachey, 2013). Normalisation of
words is performed with NER to transform a word into a unique
representation, so that forms with more than one surface variant can be
compared in a canonical form. Normalisation is also used to treat, for
example dates, acronyms, numbers, abbreviations, and social network data
such as blogs and tweets to transform the data into a form more suitable
for machine processing and computation (Mosquera et al., 2012).
IBM Discovery. The service crawls (explores content), converts, enriches,
and normalises data; a document is automatically converted and enriched
with NLP metadata. The enriched data is then indexed into a collection that can
be queried (IBM, 2021a). Discovery News explores information from the
news, where private, third-party, and public data are used; it extracts from
the input (such as a query with a company name) an outcome that provides
the most recent and relevant news and news-based sentiments from a
variety of news sources (IBM, 2021a).
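A toy sketch of the ingest-then-query pattern described for Discovery; the enrichment is reduced here to tokenisation, and the class and method names are invented:

```python
from collections import defaultdict

class Collection:
    """Minimal inverted index: documents are ingested (tokenised) into a
    collection that can then be queried by term."""

    def __init__(self):
        self.index = defaultdict(set)   # token -> document ids
        self.docs = {}

    def ingest(self, doc_id, text):
        self.docs[doc_id] = text
        for token in text.lower().split():
            self.index[token].add(doc_id)

    def query(self, term):
        return sorted(self.index.get(term.lower(), set()))

c = Collection()
c.ingest(1, "Company X announces quarterly results")
c.ingest(2, "Weather report for Kuopio")
```

A real Discovery collection would add NLP enrichments (entities, sentiment, concepts) to each indexed document, but the crawl-convert-index-query flow is the same.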
IBM Retrieve and Rank. This service extracts ranked answers based on
the queries. The service uses the power of Apache Solr to retrieve
information from a collection of documents and the rank component to
create an ML model trained on the data (IBM, 2017b). The Retrieve and Rank
service migrated to the Discovery service in October 2017. Since then, IBM
has continued to improve ML, enhance information retrieval, and add AI
capabilities to the Discovery service (IBM, 2019).
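The two-stage retrieve-then-rank pattern can be sketched as follows. The documents, the keyword-overlap retrieval, and the hand-set ranking weight are invented stand-ins for Apache Solr retrieval and a trained ranking model:

```python
DOCS = {
    "d1": "how to reset a password",
    "d2": "password policy for administrators",
    "d3": "office opening hours",
}

def retrieve(query):
    """Cheap first stage: keep documents sharing any term with the query."""
    terms = set(query.lower().split())
    return [d for d, text in DOCS.items() if terms & set(text.split())]

def rank(query, candidates, weight=2.0):
    """Second stage: re-score candidates. The linear overlap score is a
    stand-in for a model trained on relevance-labelled data."""
    terms = set(query.lower().split())
    def score(d):
        return weight * len(terms & set(DOCS[d].split()))
    return sorted(candidates, key=score, reverse=True)

hits = rank("reset password", retrieve("reset password"))
```

The design point is the split itself: retrieval keeps the candidate set small and fast, so the (more expensive) learned ranker only re-scores a handful of documents.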
IBM Tradeoff Analytics. This service identifies the best options based on
multiple criteria. Reaching a decision entails finding the best candidates, in
other words reducing the number of options to a smaller set of optimal
solutions. IBM Tradeoff Analytics introduces a visual-interactive approach to
facilitate coping with multi-objective problems. The service implements
Shneiderman's visual information seeking: "[o]verview first, zoom and filter,
then details-on-demand" (Shneiderman, 1996; IBM, 2017c). The user is
provided an overview map and can explore it using vertices as reference
points. To denote preferences, the user indicates them by filter sliders, and
solutions not favourable to the user are greyed out. The service provides
tooltips for more details and enables interactions that allow the user to
select the nearest neighbouring solutions to examine them on the map or
see additional views. The service implements Shneiderman's (1996) visual
information seeking (IBM, 2017c), visualisation of a Pareto frontier that
concerns Pareto optimality (Ehrgott, 2005; IBM, 2017c), a self-organising
map (Kohonen, 2001; IBM, 2017c), and a ColorBrewer qualitative colour
scale (IBM, 2017c). The service provides guidance based on the fundamental
idea that consequences need to be evident to the decision maker: decision
analytics are used as judgment aids (IBM, 2017c).
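The Pareto optimality underlying the service's option reduction can be illustrated with a minimal sketch. The candidate options and criteria are invented, and both criteria are minimised:

```python
def pareto_frontier(options):
    """options: dict of name -> tuple of criteria (lower is better).
    An option is dominated if another option is at least as good on every
    criterion and strictly better on at least one."""
    def dominates(a, b):
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))
    return sorted(
        name for name, vals in options.items()
        if not any(dominates(other, vals)
                   for o, other in options.items() if o != name))

# Invented phone options scored by (price, weight); both minimised.
phones = {"A": (200, 150), "B": (250, 120), "C": (260, 160)}
frontier = pareto_frontier(phones)
```

Here option C is dominated by B (more expensive and heavier), so only A and B remain as the smaller set of optimal solutions the decision maker actually needs to compare.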
IBM Visual Recognition. This service analyses images for scenes and
objects using deep-learning algorithms (IBM, 2021a). The service has built-in
models for face (until April 2, 2018), general, explicit, food, and text. The
models are trained to identify the top-level categories: animals, people,
food, plants, and fruits (IBM, 2021a). The user can train custom models to
create classes for special cases (IBM, 2021a). For example, the user provides
positive image examples of the desired class and negative image examples
that do not fall into the desired category as training data to create new
models. The outcome of the service is scored results with scores ranging
from 0 to 1; the higher the score, the better the correlation. In other words,
the service recognises, with varying accuracy, what a random image
represents if it has been trained in that subject area (IBM, 2021a).
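Interpreting the service's scored output can be sketched as a simple threshold filter; the class names, scores, and threshold below are invented:

```python
def top_classes(scores, threshold=0.5):
    """scores: dict of class name -> confidence in [0, 1].
    Return classes at or above the threshold, best first."""
    kept = [(c, s) for c, s in scores.items() if s >= threshold]
    return sorted(kept, key=lambda cs: cs[1], reverse=True)

result = top_classes({"food": 0.91, "plant": 0.62, "animal": 0.18})
```

Because higher scores indicate better correlation, a caller usually keeps only the classes above some confidence threshold rather than the full scored list.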
Microsoft Speaker Recognition. This service performs verification and
identification of a speaker using verification and identification functions. Speaker verification
depends on unique characteristics of the voice (in other words, the unique
voice signature) used to identify a person. Further, the speaker identification
function compares the audio input (i.e., voice) with the provided group of
speakers’ voices. If a match for the voice is found, the speaker’s identity is
returned. Finding a match for the voice requires pre-registration of the voice
signature of the speaker (MS, 2017b).
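A minimal sketch of the identification step, assuming speakers are represented by embedding vectors compared with cosine similarity. The embeddings and threshold are toy values; Microsoft's actual matching procedure is not public:

```python
import math

def cosine(a, b):
    """Cosine similarity of two 2-D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def identify(voice, enrolled, threshold=0.9):
    """Compare an input voice embedding with pre-registered signatures and
    return the best match above the threshold, else None."""
    best, best_sim = None, threshold
    for speaker, signature in enrolled.items():
        sim = cosine(voice, signature)
        if sim >= best_sim:
            best, best_sim = speaker, sim
    return best

# Invented pre-registered voice signatures
enrolled = {"alice": (1.0, 0.1), "bob": (0.1, 1.0)}
who = identify((0.95, 0.15), enrolled)
```

The pre-registration requirement in the text corresponds to populating `enrolled` before any identification call; without a close-enough signature, the function returns no identity.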

Figure 6. Outcomes of the cognitive services.

Figure 7. IBM Personality Insights: big-five personality dimensions and 30
facets (adapted from IBM, 2021a).

Figure 8. IBM Personality Insights: 12 needs and five values (adapted from
IBM, 2021a).

Figure 9. IBM Personality Insights: 42 consumption preferences (adapted
from IBM, 2021a).

3 SUMMARY OF PAPER-BASED CONSTRUCTIONS

The following Sections 3.1–3.9 discuss nine research papers by presenting
typed constructions (i.e., abstractions, experiments, frameworks, and
mappings) based on the papers. The final section (3.10) describes the main
adapted frameworks behind the constructions.
The names of the constructions are in italics in paragraph headings. A
reference to paper-specific figures and tables is added in parentheses at the
end of the paragraphs. Paper-specific statements related to figures and
tables are used in the descriptions of the constructions. If the figure or table
reference is not at the end of the paragraph, then the reference concerns
the summary figure or table. Only figures or tables of the most important
constructions and errors in publications have been redrawn or copied
(including reference to the original).

3.1 Big data analytics for professionals, data milling for laypeople (Paper I)

Abstraction of uncovering information nuggets from heterogeneous data as
part of competitive advantage. In this paper, the term “data milling” is used
to improve understanding of the big data phenomenon and the possibilities
of data analytics. Information nuggets from heterogeneous data reveal
different meanings to laypeople. Some nuggets can provide deeper
understanding, thus leading to a reaction or increased knowledge without
reaction. Laypeople's understanding is based on information nuggets
presented by descriptive statistics. Moreover, key questions (e.g., What can
be calculated and what is worth calculating?) result from inferential
statistics. Laypeople need to understand what has been calculated and why
(Paper I, Figure 3).

Example information nuggets for indicators. The construction type is an
experiment. The paper presents the following key questions regarding data
milling: What happened?; Why did it happen?; What is happening?; Why is it
happening?; What will happen?; Why will it happen?; What should be done?;
and Why should it be done? It has been found that when the indicators are
selected, heterogeneous data can be used to reveal information nuggets.
The idea is to seek new insights and generate ideas. Data milling can be used
in business intelligence and strategic management to achieve a better
competitive advantage. Further, descriptive statistics can be used to find
unfamiliar facts based on information nuggets; after the material is
modelled, inferential statistics can be used. An example concerning
investments in coal power plants in Europe is discussed (Paper I, Figures 2
and 4).

3.2 Exploitation and exploration underpin business, and insights underpin
business analytics (Paper II)

Abstraction of two-way transparency between selected data and principles.
This paper proposes that principles, metrics, and targets can be used as
principle-based benchmarks, and they need to be two-way transparent with
selected data, in other words from data to principles and principles to data.
The principles must contain identified mechanisms (i.e., metrics, the source
of which are datasets) that will be used to measure whether the principle
has been met (Paper II, Figure 1).
Mapping between business analytics and ambidexterity value chains is
used to clarify objective insight. It realises transparency between principles
and metrics such as business analytics with ambidexterity, value
propositions, and market performance indicators. The meaning of customer
and market business analytics presented by Marr (2016) is mapped within
exploitation and exploration and further into the value propositions and
market performance of the multi-dimensional conceptual framework
exploring and exploiting the value chain as presented by Bøe-Lillegraven
(2014; Paper II, Table 1).
Mapping between principles of business analytics and personality
insights. This process realises the transparency between cognitively related
processed data and principle-based metrics. The personality insights were
mapped within business analytics. Business analytics concern the
stakeholders, for example customers, employees, and shareholders.
Therefore, the questions in which personality insights are central are
exemplified (Table 1 contains defect repairs; Paper II, Table 2).

Table 1. Mapping between business analytics principles and personality
insights (adapted from Table 2, Paper II; table content errors have been
fixed). Each trait-based question takes the form “What are the personality
insights …”.

Business analytics       Trait-based question
Customer profitability   … of the found money-making customers?
Product profitability    … of the buyers of the found money-making products?
Value driver             … of the most important stakeholders?
Non-customer             … of the prospects?
Customer engagement      … of the customers?
Customer segmentation    … of the customers?
Customer acquisition     … of the customers?
Marketing channel        … of the customers per marketing channel?

Mapping between personality traits and expected experience. Through
these mappings, it is shown that these insights can further be modified into
value propositions. In other words, the mapping results are used to increase
understanding and strengthen the customer and user experiences (CX and
UX). Humans have intentional and unintentional experiences. Intentional
experiences can be either manifest or latent: manifest experiences are
apparent, and they are understood immediately; latent experiences can
evolve to be manifest, for example with the help of teaching. An
unintentional experience might be, for example, a concert that affects the
well-being of humans. Expected experiences can be adapted as follows:
manifest experiences are insightful; latent experiences are challenging; and
unintentional experiences are sensuous. When the traits have been mapped
within the different expected experiences, value propositions are behaviour
centric (Paper II, Table 10).

3.3 Automation capabilities cognitively challenge work activities (Paper III)

Mapping between cognitive service capabilities and work activity
automation capabilities. The correspondence between work activity and
cognitive service capabilities aids in choosing a cognitive function with the
necessary features for the system (as part of the system building blocks).
McKinsey’s capabilities are based on work activities in 18 categories. Based
on the analysed content, a category is identified, and mapping of cognitive
service characteristics onto these characteristics of work-based activities is
performed based on the comparison. The outcomes of the cognitive services
are redefined using the following criteria: whether the outcomes or parts of
outcomes are predefined, whether the cognitive service searches and
retrieves information from the data sources, and whether unsupervised
learning algorithms are used. Automation capabilities of the cognitive
services are mapped. Five automation capabilities concerning information
processing require the specific mapping rules as presented. The outcomes
of the cognitive services redefine whether the outcomes or parts of
outcomes are classified, ranked, or scored (Paper III, Figures 1–3, Table 1).
Mapping between automation capabilities and human cognitive
processes to cognitive services facilitates understanding. Understanding is
clarified by combining automation capabilities with the cognitive process
and mapping it to the cognitive service. The cognition processes and
information process (having been brought out in mappings). The
automation capabilities are supplemented with cognitive processes (Wang
et al., 2006; Wang 2015). These supplements are conducted based on the
content analysis, and their transparent mapping rules are presented. Two
automation capabilities—recognising known patterns/categories
(supervised learning) and social and emotional reasoning—are replaced
with cognitive processes. Otherwise, the automation capabilities are
supplemented by the cognitive process mapping rules based on the
outcomes of the cognitive service (Paper III, Table 2).
Rules of the capabilities provided using cognitive services to facilitate
human cognition. The construction type is an experiment. Illustrative actions
are cognitive services functionality described with illustrative activity action
words. Cognitive services are summarised in illustrative activity action
words. Illustrative action words such as bind, facilitate, revise, and manifest
can describe the outcome of cognitive services, in general, what we can do
with the outcomes of the cognitive services to help understand them. The
bind, facilitate, manifest, and revise verbs are assessed by the cognitive
services (Table 2).

Table 2. Cognitive services versus capabilities of cognitive services.

Cognitive service                    Bind  Facilitate  Revise  Manifest
IBM Natural Language Understanding          x           x
IBM Tone Analyzer                                       x
IBM Personality Insights                                        x
IBM Retrieve and Rank                       x
IBM Tradeoff Analytics                x
IBM Discovery                               x

The verbs are used to illustrate capabilities of the cognitive services by
adaptations as follows: 1) IF the cognitive service helps to solve the problem
or make decisions, THEN bind; 2) IF outcomes of the cognitive service are
the extracted ones, THEN facilitate; 3) IF outcomes of the cognitive service
help prepare something, THEN revise; 4) IF the cognitive service predicts or
summarises something, THEN manifest.
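The four adaptation rules can be encoded directly. The predicate names below are invented; the caller decides which predicates hold for a given cognitive service:

```python
def capability_verbs(solves_or_decides=False, extracts=False,
                     helps_prepare=False, predicts_or_summarises=False):
    """Apply the four IF-THEN adaptation rules in order:
    bind, facilitate, revise, manifest."""
    verbs = []
    if solves_or_decides:          # rule 1: solves problems / aids decisions
        verbs.append("bind")
    if extracts:                   # rule 2: outcomes are extracted ones
        verbs.append("facilitate")
    if helps_prepare:              # rule 3: outcomes help prepare something
        verbs.append("revise")
    if predicts_or_summarises:     # rule 4: predicts or summarises
        verbs.append("manifest")
    return verbs

# e.g. a service that extracts entities and summarises a profile
verbs = capability_verbs(extracts=True, predicts_or_summarises=True)
```

Because a service can satisfy several predicates, the function returns a list of verbs rather than a single label, matching the multi-mark rows of Table 2.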
Abstraction of the utilisation mindset of cognitive service outcomes.
Mindset demonstrates actionable insights that must be used to do
something; it demonstrates insights that need to be further processed with
or without human intervention as well as how automation releases
resources (Figure 10). Utilisation mindset is the effect of the cognitive service
reached through analysis and inductive reasoning, and it demonstrates that
we must carry out actionable insights (i.e., a useful fact that must inform,
matter, and deserve a response). Therefore, the utilisation of the insights
terminates only through action. The mindset illustrates a continuum in a
process that occurs after the input and is aimed at changing the process or
situation (i.e., interventional), for example to prevent undesired
developments in the outcome. The bind action is automated to handle the
signals without human intervention. The inferred insights are either
revisable or questionable. Thus, the actionable insights, like signals, can be
processed as bound action. Further clarifications of the utilisation mindset
are as follows:
i. Insights need to be actionable; an actionable insight must result in
something being carried out. Visualisation and analysis of the
inductive reasoning make this obvious.
ii. Interventional insights are those with or without human
interventions. Insights that need to be further processed with or
without human intervention are inferred insights. If the insights show
revisable items, for example sentences with a certain style, then the
content is revised, and the cognitive service is used again. If the
insights have questionable items, then they will be supplemented by
retrieving knowledge. In other words, some outcomes are
interventional, meaning that human interventions are needed. When
this occurs, human cognition needs to be amplified to reach valid and
desirable insights.
iii. The utilisation mindset illustrates the impact of automation (e.g.,
cognitive services) by how it releases resources with or without
intervention; this mindset helps assess the impact. Usually, the
organisational resources are limited, especially when the organisation
explores and exploits simultaneously (O’Reilly and Tushman, 2013).
The utilisation mindset helps to assess the impact and create
opportunities to balance the challenge of resources in organisational
ambidexterity.

Figure 10. Utilisation mindset of cognitive computing (Paper III, Figure 4).

3.4 Tones and traits: experiment of text-based extractions with cognitive
services (Paper IV)

Workflow experiment for value propositions. Examples of cognitive services
that build workflow are presented. Experiments on workflow for value
propositions are constructed. These experiments are conducted with
cognitive services as building blocks for outcomes to determine arguments,
for example how text-based extractions can be explained. The comparative
analysis covered IBM’s cognitive services Alchemy Language, Tone
Analyser, and Personality Insights (Paper IV, Figure 1).
Rules for transforming personality traits into questionnaires. The
construction type is an experiment. The results of the IBM Watson
Personality Insights explanation of the facets were used to transform the
explanations into the form of questions. The International Personality Item
Pool (IPIP, 2017) offers, for example, 50 personality questions and their
explanations. These IPIP items are used in personality tests, and items have
been used to implement the Personality Insights service (IBM, 2021a).
Consequently, the questions were created with the help of the IBM Watson
Natural Language Understanding service to formalise data editing and
minimise human bias. The IBM Natural Language Understanding service
derived the semantic roles (subject, action, and object) from the explanation
sentences of the traits. The semantic roles were further used to form the
trait-based personality questions using the rules, the specifications of which
are based on the denotations of the IBM Natural Language Understanding
service. The semantic roles (i.e., action, object) were used to exemplify the
trait-based personality questionnaire based on the rules (pp. 84). The
explanations of the facets are transformed into question format, and the
trait-based questions can be used in the interventions if the trait-based
questions are answered with scores from 0 to 100. The traits and their
definitions (or explanations) offer the starting point for the discussions
during the interventions (Paper IV, Figure 2).
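A hypothetical sketch of the rule-based step described above: a (subject, action, object) triple, of the kind extracted by the Natural Language Understanding service, is rewritten as a trait-based question answered on a 0–100 scale. The triple and the rewrite rule are invented for illustration; the paper's actual rules are defined against the denotations of the IBM service:

```python
def triple_to_question(subject, action, object_):
    """Turn a semantic-role triple into a trait-based question.
    The verb normalisation is a crude stand-in: it strips a trailing 's'
    to move a third-person verb towards its base form."""
    action = action.rstrip("s")
    return f"Do {subject.lower()} {action} {object_}? (0-100)"

# Invented example triple from a facet explanation
q = triple_to_question("You", "enjoys", "taking charge of situations")
```

The resulting questions, scored from 0 to 100 by the respondent, give the interventions a concrete starting point while keeping the transformation mechanical and free of human bias.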

3.5 The cognitive function and framework of the functional hierarchy (Paper V)

Framework for the functional hierarchy of cognitive functions. Records from
3D Brain and multivocal literature were used to collect evidence that
supports evidence-based guidelines for determining what types of
computing functions can be considered cognitive. The constructed
framework underpins human cognitive functions and categorises cognitive
computing functions into a functional hierarchy through which the
functional similarities between cognitive service and human cognitive
functions are presented (Figure 11). The framework construction is focused
on the functions of neural systems associated with brain structures. The 3D

65
Brain mobile application illustrates and explains the brain structures with
associated cognitive functions. First, information regarding the brain
structures and associated cognitive functions was collected and drafted in
a graph. Second, the lobes of the brain and their structures were illustrated.
Then, the associated cognitive functions were linked to the structures of the
lobes. Third, links were constructed between structures by following the
description information. In the analysis of the contents, it was found that the
functions form the chain of processes for performing a task, for example
visual perception. It was also found that some processes, such as language
processes, need further description. Therefore, the structures and
associated functions of the brain presentation were supplemented.
Literature searches were used to search for the supplemented information;
these supplements are shown with descriptions of processes with
references and/or cited statements.

Figure 11. Framework for the functional hierarchy of cognitive functions
(adapted from Paper V).

Mapping human cognitive functions onto groups of cognitive functions.
Construction of groups of cognitive functions was performed to map human
cognitive functions onto groups of cognitive functions. Human cognition
involves multi-functional cooperation (Figure 2). Construction of the groups
of cognitive functions and analysis of the contents were performed to assess
data and construct categories. The brain functions were grouped according
to nine main processes. The groups were obtained from two sub-
assemblies: the intentional functions of the brain, and the paths of the
transition of the sensory stimuli to the cognitive processes. If the associated
cognitive function of a brain structure occurred in more than one process, it
was listed and mapped in each participating process because human cognitive
function results from multi-functional cooperation (Paper V, Figure 2).
Abstraction of the functional hierarchy of cognitive functions. This addresses
the similarities between the functions in cognitive computing and human
cognitive functions and summarises the differences and similarities
between the functions of the cognitive services and applications that are not
defined as cognitive (NDC; Figure 12). These groups of cognitive functions
form a functional interactive hierarchy. The associated cognitive functions
are grouped adjacent to their respective sensory stimuli as visual, auditory,
motor, sensation, and homeostasis functions. Further, language, emotion,
and behaviour-related functions of multi-processes are presented. Memory-
related functions are collected into memory functions, and finally, higher
complex executive cognitive functions are included in cognitive functions
(Figure 12).

Figure 12. Hierarchy of cognitive functions (adapted from Paper V,
Figure 4).

The functionality of some cognitive service examples with online
demonstrations can be evaluated by experiment to characterise the
cognitive computing functions. The McKinsey Global Institute analysis
(Exhibit 4, p. 37), for example, identified the social, cognitive, and physical
patterns of capabilities which are often required to support many activities.
Hence, chosen functionality examples of cognitive services and applications
that are NDC are examples of these capabilities. Their functionality was
described and mapped to the brain functions, and their similarities are
presented. The framework was used to classify and hierarchise cognitive
service functions. For comparison, NDC application functionalities were
mapped to brain function, and their similarities are presented. The results
are summarised based on content analysis, comparison, and quantification,
which presents the differences and similarities between cognitive
computing and NDC function. Human cognitive function also results from
multi-functional cooperation, for example the functionality of language
processes. Similarly, cognitive services contain many functions that produce
cognitive outputs, which constitutes a similarity between human cognitive
functions and the functions in cognitive computing. In comparison, it was
found that human cognitive functions are not fully compatible with cognitive
service functions because the extent of functionality is not the same. The functional
similarities are presented through the hierarchy between cognitive services
and human cognitive functions to illustrate what kind of functions are
cognitive in computing. Further, cognitive computing functions produce
cognition results with language, emotional, and behavioural interpretation
capabilities. The functions go beyond human abilities, for example using big
data’s retrieval quantity, speed, and variety, and they combine data from
different sources. However, the extent of some functionalities, such as human
visual recognition of images that require imagination to interpret, is
beyond the cognitive services. Similarities can be found in visual, auditory,
motor, sensation, and homeostasis functions, language and emotion,
behaviour functions, memory functions, and cognitive functions (meaning
that they can be found in all groups of human cognitive functions). The
differences and similarities between the functions of the cognitive services
and NDC applications are summarised. NDCs represent applications with
rules and pre-determined processes for producing results. Comparisons
made through the hierarchy showed that both types of solutions produce
results in cognitive functions and that the results of computational functions
are cognitive. In general, the computational functions produced cognitive
function results since they were implemented to serve a specific purpose. In
other words, they were deployed to serve human-origin cognitive needs, for
example calculation. Further, differences were found in the cognitive
functions of functional hierarchy categories in the following hierarchies:
language, emotional and behavioural functions, memory functions (e.g.,
learning), and cognitive functions (e.g., interpretation of sensations).
Summarising the above three constructions, 137 human cognitive
functions were studied and compared to cognitive services. The IBM Tone
Analyzer functionalities were similar to 65 human cognitive functions. The
IBM Visual Recognition functionalities were similar to 27 human cognitive
functions. The Microsoft Speaker Recognition functionalities were similar to
45 human cognitive functions.
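The quantification step behind these counts can be illustrated with plain set arithmetic (a minimal sketch; the placeholder function names stand in for the actual 137-function inventory):

```python
# Illustrative sketch (hypothetical function names): quantifying how many
# human cognitive functions each service's functionality resembles.
human_functions = {f"fn_{i}" for i in range(137)}  # placeholder inventory

service_similarities = {
    "IBM Tone Analyzer": {f"fn_{i}" for i in range(65)},
    "IBM Visual Recognition": {f"fn_{i}" for i in range(27)},
    "Microsoft Speaker Recognition": {f"fn_{i}" for i in range(45)},
}

for service, fns in service_similarities.items():
    covered = fns & human_functions  # intersection with the inventory
    print(f"{service}: {len(covered)}/{len(human_functions)} "
          f"({100 * len(covered) / len(human_functions):.0f}%)")
```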

3.6 Behavioural interventions from trait insights (Paper VI)

Workflow experiment from tagged messages to personality insight. The
construction type is an experiment. The focus is on techniques that are
behind text-based traits. It is clarified whether the trait insights of the
Personality Insights service can be used in behavioural interventions and
how behavioural interventions can be amplified through text-based trait
insights. The experiment concerned a data dump of 20 web channels (e.g.,
Facebook comments) containing 53,294 messages, which was used to obtain
transparency and understanding concerning corpus-based insights from the
IBM Personality Insights service. In an example of trait-based insights
using the authors' personality, the techniques for obtaining personality
characteristics and consumption preferences were researched via the IBM
Watson Personality Insights API. The effort was in realising transparency in
a way that outcomes are obvious. Traceability from data to insight by
technical means clarifies and finds proof, for example, for missing
arguments. The techniques behind text-based traits are described as
transparent and standardised measures and methods required for the
pipelines from raw data to traceable results. For example, for traceability,
open and standardised measures and methods are needed to amplify the
behavioural interventions from text-based trait insights. In tracing from data
to insight, the idea is to transform the text into computable form by
converting it into a structural form such as vectors. Examples of techniques
are provided. The meaning of the term “embedding” is exemplified by the
GloVe algorithm and explained, since the IBM Personality Insights service
uses this algorithm to infer text-based traits. GloVe is very useful in NLP
tasks such as demonstrating semantic and syntactic regularities. This
example constitutes an effort to take one step toward clarifying and finding
open measures between raw data and results. In an analysis of the results,
some personality characteristics correlated statistically significantly with
consumption preferences. Differences were found in correlation coefficients
in two cases: the consumption preferences versus the raw scores of the
personality characteristics and the consumption preferences versus the
percentiles of the personality characteristics. However, it is unknown how
the words of the texts are mapped onto the personality characteristics.
There is no knowledge of how personality characteristics affect
consumption preferences. Therefore, Pearson correlation calculations were
used to illustrate the relationship, in other words the correlations between
the percentiles of the personality characteristics and the consumption
preferences. The calculations were realised using the rcorr function of the
Hmisc (Harrell Miscellaneous) package (Paper VI, Figure 3).
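As a hedged illustration of this analysis step: Paper VI used the rcorr function of R's Hmisc package, which reports correlation coefficients together with p-values; the Pearson coefficient itself can be computed as follows (the sample data are hypothetical, not the study's data):

```python
# A minimal sketch of the correlation step; rcorr additionally reports
# p-values, which this plain-Python version omits.
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical percentiles of a personality trait vs. a consumption preference
trait_percentiles = [0.12, 0.45, 0.61, 0.70, 0.88]
preference_scores = [0.20, 0.40, 0.55, 0.75, 0.90]
print(round(pearson(trait_percentiles, preference_scores), 3))
```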

3.7 Awareness of automation, data origins, and processing
stakeholders through parsing the General Data Protection
Regulation sanction-based articles (Paper VII)

Experiment of a method for using and assessing cognitively computed
semantic roles. Authority documents such as the General Data Protection Regulation
(GDPR) contain difficult-to-read or wordy sentences. Therefore, it might be
difficult to learn even the semantic roles of the sentences: subjects, objects,
and actions. The IBM Watson Natural Language Understanding Text Analysis
service (IBM, 2021b) is used to parse sentences into semantic roles. Further,
the Atlas.ti content analysing tool is used to code stakeholder instances from
an Excel file; the columns are article, paragraph, subject, action, and object.
Atlas.ti (2020) automatically codes columns with names that help validate
potential stakeholders before they are automatically coded and to explore
co-occurrences. Finally, human interpretation results are compared with the
semantic roles the IBM Watson Natural Language Understanding Text
Analysis service manifests. Krippendorff’s alpha (Neuendorf, 2002) was used
to assess intercoder reliability; the value was 0.85, which refers to the
capabilities of the IBM Watson Natural Language Understanding service that
can be used to amplify human interpretations (Paper VII, Tables 1–8).
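For illustration, Krippendorff's alpha for the nominal, two-coder case with no missing values can be computed as follows (a minimal stdlib sketch; the example labels are hypothetical, not the Paper VII data, and packaged implementations handle ordinal levels and missing values as well):

```python
# Minimal sketch: Krippendorff's alpha, nominal data, two coders,
# no missing values. Alpha = 1 - observed/expected disagreement.
from collections import Counter
from itertools import chain

def krippendorff_alpha_nominal(coder_a, coder_b):
    """Intercoder reliability for two coders at the nominal level."""
    n_units = len(coder_a)
    n = 2 * n_units  # total pairable values
    disagreements = sum(a != b for a, b in zip(coder_a, coder_b))
    d_observed = 2 * disagreements / n  # each disagreeing unit yields 2 pairs
    totals = Counter(chain(coder_a, coder_b))  # marginal value frequencies
    d_expected = sum(
        totals[c] * totals[k]
        for c in totals for k in totals if c != k
    ) / (n * (n - 1))
    return 1.0 - d_observed / d_expected

# Hypothetical semantic-role codes from a human coder and a service
human = ["subject", "object", "subject", "action", "object"]
service = ["subject", "object", "subject", "action", "subject"]
print(round(krippendorff_alpha_nominal(human, service), 3))
```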

3.8 Low-code autoML-augmented data pipeline: a review and
experiments (Paper VIII)

Low-code autoML-augmented data pipeline experiment. Low-code autoML
frameworks used in business intelligence (BI) tools allow increased data
comprehension. An experiment explored a pipeline from raw data to obtain
a deeper understanding concerning meaningful fields. The PyCaret
framework was used in the experiment to automate data processing
workflows (also known as pipelines). The data source of a pipeline is
reusable for different perspectives by different reports, dashboards, and
applications. The low-code autoML-augmented pipeline is flexible because of its
adaptable building blocks such as libraries and visualisations. The pipeline
items as steps are as follows: 1) curated or unmanifestable data, 2)
explorable or learnable data, 3) schema-relatable or stand-alone data, 4)
model files or best model identifiers, 5) dashboard and reports, and 6) low-
code applications. Google Colaboratory is used for the PyCaret framework
to find the best model and most important features. The adjustable
parameters in the PyCaret framework are for ML and other common
processes. The main functions of the parameters are illustrated with a
perceivable verb in two groups: feature collection and feature values. The
verbs are presented to help and support memory. The verbs for feature
collection are reduce, bin, group, ignore, permutate, drop, combine, relate,
detect, and cluster; and the verbs for feature values are impute, type,
encode, unwanted, rescale, reshape, retarget, and replace. For the PyCaret
model setup, there are two mandatory parameters: data and a target
column. The model is then used in the Microsoft Power BI query editor in
Python to build the classifier and classify the dataset. The Microsoft Power
Platform contains Power BI and Power Apps to provide reports, dashboards,
and low-code applications. There are several visualisation possibilities, from
quick insights to AI visuals such as key influencers and a decomposition tree.
The classified labels permit the use of the reports to deepen users’
understanding of meaningful fields. “In general, the low-code autoML
frameworks are cognitive supportive when they manifest insights from
datasets” (Paper VIII, Figure 3).
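The six pipeline steps can be sketched in a framework-agnostic way (an illustrative sketch only; the function names are hypothetical and do not reproduce PyCaret's or Power BI's APIs):

```python
# Framework-agnostic sketch of the pipeline steps: one curated data
# source feeds a model step whose output is reused by reports and apps.
def curate(raw_rows):
    """Steps 1-2: keep explorable rows (drop records missing the target)."""
    return [r for r in raw_rows if r.get("target") is not None]

def fit_best_model(rows, target="target"):
    """Steps 3-4: stand-in for autoML model selection (majority class)."""
    labels = [r[target] for r in rows]
    best = max(set(labels), key=labels.count)
    return {"best_model": "majority-class", "predict": lambda r: best}

def report(rows, model):
    """Steps 5-6: a reusable summary consumed by dashboards and apps."""
    return {"rows": len(rows), "model": model["best_model"],
            "labels": [model["predict"](r) for r in rows]}

raw = [{"f": 1, "target": "yes"}, {"f": 2, "target": None},
       {"f": 3, "target": "yes"}, {"f": 4, "target": "no"}]
data = curate(raw)
print(report(data, fit_best_model(data)))
```

The point of the sketch is the reuse: the same curated data source and model output serve multiple downstream reports, dashboards, and applications.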

3.9 Applying frameworks for cognitive services in IIoT (Paper IX)

Framework to achieve measurable advantages in digitalisation. Applying the
value proposition canvas to the Industry 4.0 business model canvas (BMC)
and lean canvas makes measurable issues more concrete: it clarifies
traceability and transparency, bringing objective insights, and thus the
advantages of digitalisation, closer to business objectives. The
technological advantages can be harnessed for business use with data-
driven strategies wherein cognitive services are integrated into smart things
(Paper IX, Figure 3).
Context-aware information transparency and smart indicators. An
example is provided of the ability of cognitive services to infuse the
knowledge of service and business objectives into the production line.
Context-aware information transparency augments situational awareness,
presents optimised recommendations for possible courses of action based
on situational analysis and predictions, and integrates this knowledge into
a learning environment. Integrated BI systems help form an agile view. Data
needs to be transformed into a form that a human can understand.
Therefore, the focus on action options needs to be aligned with measures
according to the target values, which constitute the definite goal (Paper IX,
Figures 11 and 12 and Table 2).

3.10 Research methods as adapted frameworks

The constructions were established by adapting several frameworks.
Multivocal literature review guidelines (Garousi et al., 2019) were used for
framework selection. The criteria descriptions for each framework selection
are as follows:

• Literature review guidelines: Okoli and Schabram (2010), Kitchenham
and Charters (2007) (Paper I). Used to improve understanding of the
big data phenomenon and the possibilities of data analytics.
• Guidelines for the validity evaluation of research results per Runeson et
al. (2012) were used in Papers I, III, and VII.
• In Paper II, a content analysis of Ahlemeyer-Stubbe and Coleman’s
(2014) research into practical data mining for business and industry
and Lanigan’s (1994) research for capta versus data resulted in the
need for two-way transparency between selected data and principles.
Key business analytics (Marr, 2016) and ambidexterity value chains
(Bøe-Lillegraven, 2014) were chosen to fill a research gap:
ambidexterity is combined within analytics, either business analytics
or insights based on cognitive computing. In addition, for behaviour-
centric value propositions, the mappings concerning the expected
experience based on Rauhala (1995 and 2009) and the personality
traits (IBM, 2021a) were used.
• In Paper III, a content analysis of automation capabilities (McKinsey
Global Institute, 2017) was chosen for capabilities. They are based on
the work activities needed to construct the capabilities of the cognitive
services, and through inductive reasoning, they are used for the
utilisation mindset to illustrate what can be done with the outcomes
of cognitive services.
• In Paper III, content analysis on the cognitive processes in the layered
reference model of the brain (LRMB) (Wang et al., 2006; Wang, 2015)
was selected and used to supplement McKinsey’s automation
capabilities.
• In Paper IV, the International Personality Item Pool (IPIP, 2017) was
identified through the research gap related to the sought features. IPIP was
chosen to find arguments for the extractions in the workflow for value
propositions.
• Paper V is based on a search of cognitive functions. 3D Brain (Cold Spring
Harbor Laboratory, 2017) was chosen because it illustrates and
explains the brain structures with associated cognitive functions. The
study describes what types of computing functions could be
considered cognitive and helps identify, use, and classify the
properties and abilities of cognitive functions.
• In Paper VI, global vectors for word representation in the GloVe
algorithm (Pennington et al., 2014) were chosen for traceability from
text to a personality trait. Other word-embedding techniques were
researched to focus on amplifying behavioural interventions from
text-based trait insights.
• In Paper VII, a content analysis based on semantic roles in subject-
action-object from the IBM Watson Natural Language Understanding
Text Analysis service (IBM, 2021b), data categories (Mussalo et al.,
2018), processing stakeholders (EC, 2016), and Krippendorff’s alpha
were used to estimate the intercoder reliability between two coders
(the cognitive service and human interpreter) (Neuendorf, 2002).
• Paper VIII presents a review of the latest free and open-source
software ML frameworks based on Mardjan (2020). The review aims
to find outliers, classes, clusters, and precalculations as well as
usability by Google Colaboratory and Microsoft Power BI (Moez,
2020a). The PyCaret framework (Moez, 2020b) was chosen, especially
the definitions of parameters adapted for model setup. A low-code
autoML framework with Google Colaboratory and BI tool usability
with pipeline steps were used to illustrate possibilities to construct
pipelines from raw data to different reports, dashboards, and
applications.
• In Paper IX, the business model canvas (Osterwalder, 2004), lean
canvas (Maurya, 2017), and the value proposition canvas
(Osterwalder, 2012) were adapted for cognitive services in IIoT to
achieve measurable digitalisation advantages and context-aware
information transparency and smart indicators.
• In Paper IX, a content analysis (DeFranco and Laplante, 2017) was
based on qualitative software engineering research (Glass et al., 2004).

4 CONCLUDING REMARKS

Cognitive services can be compounded and used as building blocks for a
cognitive application. Programming interfaces targeted to contribute to
reaching the goal constitute building blocks for constructing more significant
functional entities. There are 20 constructions, of which 12 are based on
cognitive services (Table 3).

Table 3. Constructions and used cognitive services: PI = IBM Personality
Insights, TA = IBM Tone Analyzer, NLU = IBM Natural Language
Understanding, AL = Alchemy Language, RR = IBM Retrieve and Rank, TAO
= IBM Tradeoff Analytics, D = IBM Discovery, VR = IBM Visual Recognition,
and SR = Microsoft Speaker Recognition.

P | Construction | PI TA NLU AL RR TAO D VR SR
II | Mapping between principles of business analytics and personality insights | x
II | Mapping between personality traits and expected experience | x
III | Mapping between cognitive service capabilities and work activity automation capabilities | x x x x x x
III | Mapping between automation capabilities and human cognitive processes to cognitive services | x x x x x x
III | Rules of the capabilities provided using cognitive services to facilitate human cognition | x x x x x x
IV | Workflow experiment for value propositions | x x x
IV | Rules for transforming personality traits into questionnaires | x
V | Abstraction to cognitive functions functional hierarchy | x x x
VI | Workflow experiment from tagged messages to personality insight | x
VII | Experiment of a method for using and assessing cognitively computed semantic roles | x
IX | Framework to achieve measurable advantages in digitalisation | x x
IX | Context-aware information transparency and smart indicators | x x

Fifteen constructions of the thesis signify a structure of parts that can be
expanded or specialised to support something in a certain context (Table 4).
Marks (x) are added in the contribution columns of the table according to the
reason that supports the contexts (i.e., ambidexterity, cognitive computing,
and insights).

Table 4. Context-related constructions: A = ambidexterity, CC = cognitive
computing, I = insights.

P | Construction | Reason | A CC I
I | Abstraction of uncovering information nuggets from heterogeneous data as part of competitive advantage | Improve understanding of the big data phenomenon and the possibilities of data analytics | x
I | Example information nuggets for indicators | Presents insights that use indicators | x
II | Abstraction of two-way transparency between selected data and principles | Illustrates that meaningful information (a.k.a. insights) has been derived based on business objectives | x
II | Mapping between business analytics and ambidexterity value chains | Utilises ambidexterity value chains of Bøe-Lillegraven (2014) and business analytics of Marr (2016) to clarify objective insights | x x
II | Mapping between principles of business analytics and personality insights | Supports the insight and verifies the cognitive service through transparency between cognitively processed data and principle-based metrics realisation | x
II | Mapping between personality traits and expected experience | Presents that behaviour-centric insights can be used for value propositions | x
III | Abstraction of the utilisation mindset of cognitive service outcomes | Demonstrates how automation releases resources, refers to the exploitation of cognitive services, demonstrates insights that need to be further processed with or without human intervention, and demonstrates actionable insights that must carry out something | x
IV | Workflow experiment for value propositions | Presents cognitive service value propositions | x
V | Framework for the functional hierarchy of cognitive functions | Provides evidence-based guidelines to determine what types of computing functions could be considered cognitive | x
V | Abstraction to cognitive functions functional hierarchy | Compares the similarities of the functions in cognitive computing and human cognitive functions and summarises the differences and similarities between the functions of the cognitive services and applications that are not defined as cognitive (NDC) | x
VI | Workflow experiment from tagged messages to personality insight | Clarifies whether the trait insights of the Personality Insights service can be used in behavioural interventions and how behavioural interventions can be amplified from text-based trait insights | x
VII | Experiment of a method for using and assessing cognitively computed semantic roles | Refers to the use of the cognitive service to parse sentences into semantic roles and of Atlas.ti to validate potential stakeholders before they are automatically coded and to explore co-occurrences | x
VIII | Low-code autoML-augmented data pipeline experiment | Increases data understanding; refers to things that aid insights by low-code autoML results that can be used with BI tools | x x
IX | Framework to achieve measurable advantages in digitalisation | Clarifies traceability in transparency to bring objective insights closer to business objectives; technological advantages can be harnessed for business use with data-driven strategies wherein cognitive services are integrated into smart things | x
IX | Context-aware information transparency and smart indicators | Refers to the ability of cognitive services to infuse the knowledge of service and business objectives into the production line | x

Only one construction adapted the ambidexterity framework of Bøe-Lillegraven
(2014). However, exploitation and exploration can benefit from automation
when organisations confront the problem of narrow resources.
Two constructions adapted cognitive functions to cognitive computing.
One construction addressed automated ML, and ML was included in
cognitive computing. Cognitive computing concerns sophisticated
algorithms and device architectures such as deep learning and platforms
inspired by human neural networks. However, verification of the cognitively
computed outcomes is challenging.
There were 13 constructions that pointed out that insights can be formed
or manifested in different ways and that they can be either derived or
inferred. Cognitively computed insights must be interpretable. Therefore,
the requirement of open and standardised measures in the pipelines from
raw data to traceable results must be emphasised. BI tools support
transparency for each step of data processing, thereby aiding human
cognition.
The research process phases can be used when organisations establish
proofs of concept before adapting new building blocks or technology
(Section 4.1). After that, the construction-based answers for the research
questions are derived (Section 4.2). Examples of the construction-based
implications are listed, and the validity of the research results is evaluated
(Section 4.3). Finally, future research issues are outlined (Section 4.4).

4.1 Power of the research process phases

This dissertation highlights the exploration of automation capabilities and
human cognition to help discover meaningful use cases for cognitive
services. Proofs of concept are required to determine whether something
fits for purpose and use. Hence, it is reasonable to use a wide range of
technological solutions to seek and exploit business goals and recognise
business threats and opportunities (Bresciani et al., 2018; Lucas and Goh,
2009). Technology concerning cognitive computing transforms both
disciplines and industries (Kelly, 2015a). Further, cognitive computing
aggregates collective and computational intelligence (Kaltenrieder et al.,
2015) and provides information about decisions (Hoffenberg, 2016). In
general, organisations must be aware of what to use, what can be gained,
and how to assess and amplify technology adaptations (Figure 13).

Figure 13. Research process phases and research questions framed for
construction mappings.

The purpose of the research process phases (i.e., hype framing/landing,
functional framing, content framing, technical framing, and continuous
impact assessment) is to assist in the discovery of cognitively calculated
objective actionable insights in organisations: hype framing/landing (Papers
I and II), functional framing (Paper III), content framing (Papers IV, V, VII, VIII
and IX), technical framing (Papers VI, VIII, and IX), and continuous impact
assessment (Papers I–IX).
Hype framing/landing aims to increase understanding of the main
concerns of exploitation and exploration and ongoing simultaneous impact
assessment. The examples of cognitive services in this step aim to augment
the understanding of possibilities to use known personality traits as well as
to reveal information nuggets from the heterogeneous data. The thesis
presents experiments of low-code autoML-augmented data pipeline and
cognitive services in IIoT.
In functional framing, the aim is to increase understanding of work
activities and human cognitive processes as well as to identify and
understand stakeholders’ desires and needs in constructing value
propositions. The cognitive services examples in this step aim to augment
the understanding of corresponding automation capabilities and cognitive
processes.
Content framing aims to increase understanding of the ground truth
combined with ongoing impact assessment. In this phase, the examples of
cognitive services aim to augment understanding about human cognition
amplifications concerning textual data, visual data, and audio data as well as
understanding of word-based traits without ground truth and text
extractions. Augmentation of human understanding with low-code autoML
frameworks can be used to form models from the ground truth data that
are compatible with BI tools (e.g., a pipeline from raw data to obtain a
deeper understanding of meaningful field classes in tabular data). The
model and ground truth assessment can also be supplemented with
context-related business indicators, and cognitive services can be integrated
into smart things to augment human understanding.
Technical framing determines the required technical methods to reach
and verify desired results. This research process uses examples of cognitive
services. The phase aims to understand the word vectors behind feature
analysis combined with ongoing impact assessment as well as to find and
augment understanding concerning meaningful fields, for example tabular
data through low-code autoML frameworks using BI tools. The business
model canvas (BMC), lean canvas, and value proposition canvas can help
clarify the objective targets. Cognitive service outcomes can serve as the
situational knowledge
used to reach the goal, and integrated BI and analytics tools can assess the
impacts and make predictions.
Further, this dissertation underlines awareness (continuous impact
assessment) of the whole cycle of cognised, governed, and compliant data
usage that needs to be considered in the search for insights. The major
contributions concerning framing cognitively computed insights are the
functional hierarchy of cognitive functions and the utilisation mindset. These
constructions help with comparison and thus the evaluation of the impact
of the applications; for example, they are useful in the organisational duality
between exploration and exploitation, in estimating impact, and in illustrating
how automation releases resources with or without human intervention. In
general, organisations must be aware of what to use, what can be gained,
and how to assess and amplify technology adaptation.

4.2 Answers to the research questions

Twenty constructions are used to answer the research questions (Table 5):
four constructions are abstractions; nine constructions are experiments;
three constructions are frameworks; and six constructions are mappings.
Eight constructions help to determine correspondences between the
cognitive services and human cognitive functions. Six constructions can be
used to understand the meaning of human cognition amplification within
the cognitively computed insights. Six constructions help assess the impacts
of the cognitively computed data for organisations.

Table 5. Constructions and the main aim of utilisation as well as research
questions: P = paper, A = abstraction, E = experiment, F = framework, M =
mapping, Q1 = cognitive capabilities, Q2 = human cognitive amplification,
Q3 = impact assessment.

P | Construction | A E F M | The main aim of utilisation | Q1 Q2 Q3
I | Abstraction of uncovering information nuggets from heterogeneous data as part of competitive advantage | x | Understand what can be calculated, what is worth calculating, and what and why it is calculated | x
I | Example information nuggets for indicators | x | Understand the key questions of data milling (for finding indicators): what happened, why it happened, what is happening, why it is happening, what will happen, why it will happen, what should be done, why it should be done | x
II | Abstraction of two-way transparency between selected data and principles | x | Find transparency and understanding of data-to-principles and principles-to-data to monitor performance | x
II | Mapping between business analytics and ambidexterity value chains | x | Create transparency between value proposition and business analytics based on the ambidexterity value chain framework | x
II | Mapping between principles of business analytics and personality insights | x | Create transparency between business analytics measures and refined personality insights | x
II | Mapping between personality traits and expected experience | x | Help understand and amplify customer experience based on personality traits | x
III | Mapping between cognitive service capabilities and work activity automation capabilities | x | Find the correspondences between cognitive service capabilities and work activities | x
III | Mapping between automation capabilities and human cognitive processes to cognitive services | x | Find the correspondences between human cognitive processes and work activities | x
III | Rules of the capabilities provided using cognitive services to facilitate human cognition | x | Describe the capabilities of cognitive services to facilitate human cognition | x
III | Abstraction of the utilisation mindset of cognitive service outcomes | x | Determine and facilitate possibilities for the inferred insights | x
IV | Workflow experiment for value propositions | x | Find evidence and arguments concerning the value proposition of the cognitive services to support human cognition | x
IV | Rules for transforming personality traits into questionnaires | x | Find the ground truth concerning the value proposition of the cognitive services to support human cognition | x
V | Framework for the functional hierarchy of cognitive functions | x | Reach comparability and correspondence between human cognitive functions and the cognitive functions of the cognitive services | x
V | Mapping human cognitive functions onto groups of cognitive functions | x | Construct larger functional entities by using cognitive functions as building blocks | x
V | Abstraction to cognitive functions functional hierarchy | x | Find similarities either between human cognitive functions and applications or between applications | x
VI | Workflow experiment from tagged messages to personality insight | x | Find transparency and understanding concerning corpus-based insights | x
VII | Experiment of a method for using and assessing cognitively computed semantic roles | x | Use cognitive services to manifest indicative semantic roles | x
VIII | Low-code autoML-augmented data pipeline experiment | x | Use autoML and BI tools to manifest meaningful fields; build AI visualisations, reports, and low-code applications to augment human cognition | x
IX | Framework to achieve measurable advantages in digitalisation | x | Use the framework to determine the needs and needed measurements and combine the situational information to support decision making and transparency | x
IX | Context-aware information transparency and smart indicators | x | Combine the target value factors with cognitive services and smart things | x

Cognitive capabilities: What are the correspondences between human
cognition and cognitive services? When the correspondences between
human cognition and cognitive services are accumulated (Question 1),
abstractions and mappings are the main construction types. The research
question concerning cognitive capabilities asked what the correspondences
are between human cognition and cognitive services. Automation
capabilities concerning work activities are used to explain the
correspondences of cognitive service capabilities and work activities and to
explain the correspondences of the human cognitive processes and work
activities. Further, the capabilities of cognitive services are used to facilitate
human cognition. Similarities can be found in all groups of human cognitive
functions. Awareness of both capabilities and functionalities of the cognitive
services contributes to fulfilling the system requirements.
Human cognitive amplification: How can human cognition be amplified
within cognitively computed insights? When the methods to amplify human
cognition in cognitively computed insights are collated (Question 2), they are
manifested either by cognitive services or by automated ML frameworks;
experiments and frameworks are the main construction types. The
outcomes of cognitive services can be implemented as building blocks
to construct larger functional entities. If the experiences of users or other
stakeholders are critical success factors, then personality trait-based
behaviour management is highlighted. Further, cognitive services can
enhance insights on a product, process, or service, and can therefore
be targeted to meet the needs of stakeholders and companies. Above all,
awareness of the cognitive similarities between humans and applications
promotes targeting the outcomes of cognitive services.
Impact assessment: How can the impacts of the cognitively computed
data be assessed for organisations? When the means of assessing the
impacts of the cognitively computed data for organisations are gathered
together (Question 3), the construction types are abstractions, frameworks,
and mappings. Performance monitoring requires transparency from
strategy to operational efforts. Moreover, cognitively computed insights
must be authorised, at least, to manifest something. The ground truths
might be difficult to determine. Integrating BI and analytics with cognitive
services can contribute knowledge of impacts and add transparency to the
pipeline from raw data to insight. However, awareness of the inferred data
or insights must be increased. For example, evaluation experiments
increase awareness of capabilities and functionalities within reliability
issues.
4.3 Practical implications

The discovered correspondences in the framing of this dissertation are
based on the constructions. Further, this dissertation does not enumerate
all possible computer cognitive functions but describes their functional
ideas (Paper V). Hence, it allows the implementer to follow the idea of
building blocks of cognitive functions and construct value addition. From the
framing it can be observed that human cognition amplifications affect how
the impacts can be assessed. In conclusion, the results suggest that when
cognitive data usage is well augmented, transparent, objective, and
actionable, with or without interventions, its usage impact is self-evident.
However, the organisations must address their competitiveness and
effectiveness by adopting objective insights with the help of cognitive
services. It can be determined whether the required cognitive capabilities
can be fulfilled by humans or by cognitive services, for example, in the
following ways:
• List work activities and evaluate what kind of cognitive capabilities
are required (Paper III: rules of the capabilities provided using
cognitive services to facilitate human cognition).
• Divide the activities’ functionalities into simple functions as building
blocks since human cognition results from multi-functional
cooperation (Paper V: mapping human cognitive functions onto
groups of cognitive functions).
• Use the framework for the functional hierarchy of cognitive functions
to find the correspondences of functionality in the cognitive service
(Paper V: framework for the functional hierarchy of cognitive
functions).
• Use the utilisation mindset to clarify the automation possibilities (Paper
III: abstraction of the utilisation mindset of cognitive service
outcomes).
• Amplify human cognition by showing transparent results from raw
data to insights (Paper VIII: low-code autoML-augmented data
pipeline experiment; Paper IX: context-aware information
transparency and smart indicators).
• Amplify human cognition by using descriptive statistics to find
unfamiliar facts based on information nuggets; then, model the
material and use inferential statistics (Paper I: example information
nuggets for indicators).
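The last bullet's two-step pattern (descriptive statistics to surface an information nugget, then inferential statistics to test it) can be sketched as a minimal example; the indicator values below are invented for illustration and are not taken from the papers:

```python
import statistics as st
import math

# Hypothetical indicator values: weekly output of two groups of plants.
group_a = [41.0, 39.5, 42.1, 40.3, 41.7, 40.9]
group_b = [37.8, 38.4, 36.9, 38.1, 37.2, 38.6]

# Step 1: descriptive statistics surface a "nugget" worth a closer look.
mean_a, mean_b = st.mean(group_a), st.mean(group_b)
sd_a, sd_b = st.stdev(group_a), st.stdev(group_b)
print(f"group A: mean={mean_a:.2f} sd={sd_a:.2f}")
print(f"group B: mean={mean_b:.2f} sd={sd_b:.2f}")

# Step 2: inferential statistics (here Welch's t-statistic) check whether
# the observed difference is large relative to the sampling variability.
n_a, n_b = len(group_a), len(group_b)
se = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
t = (mean_a - mean_b) / se
print(f"Welch t-statistic: {t:.2f}")
```

A |t| well above 2 on such small samples suggests the descriptive difference is unlikely to be noise, which is the point at which modelling the material with inferential statistics becomes worthwhile.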
The outcomes of the cognitive services can be applied as building blocks
to construct larger functional entities, for example in the following
ways:
• List work activities and evaluate whether they will be automated
using cognitive services (Paper III: abstraction of the utilisation
mindset of cognitive service outcomes).
• Functionality can be reconciled with human cognition, thus helping
find the necessary reinforcements that correspond to human
cognition (Paper V: framework for the functional hierarchy of
cognitive functions).
The proposed constructions to assess impacts of the cognitively
computed insights can be applied, for example, in the following ways:
• List key questions concerning business analytics and determine
whether there are cognitive services that can be used to provide
answers for the questions (Paper I: abstraction of uncovering
information nuggets from heterogeneous data as part of competitive
advantage; Paper I: example information nuggets for indicators;
Paper II: mapping between business analytics and ambidexterity
value chains; Paper II: mapping between principles of business
analytics and personality insights; Paper IX: a framework to achieve
measurable advantages in digitalisation).
• Manifested personality traits of employees and other stakeholder
representatives can be used to manage behaviour or to plan better
experiences concerning interactions of different kinds (Paper II:
mapping between personality traits and expected experience; Paper
VI: workflow experiment from tagged messages to personality
insight).
• Transparent data-driven objectives can be manifested and enhanced
by cognitive services (Paper IX: a framework to achieve measurable
advantages in digitalisation; context-aware information transparency
and smart indicators).
• List work activities and evaluate whether they will be automated using
cognitive services (Paper III: abstraction of the utilisation mindset of
cognitive service outcomes).
• Assess the impact of the cognitive functions in applications between
new building blocks or technology and earlier systems in place (Paper
III: mapping between cognitive service capabilities and work activity
automation capabilities).
Accuracy of the results was checked using three aspects of validity
(Runeson et al., 2012, 71). Construct validity concerns the used concepts,
and it asks whether the researcher has understood the concepts whose
meanings support the research questions. Internal validity refers to causal
relations, whether the researcher has understood what factors affect the
studied factor. External validity refers to the generalising of the results. In
qualitative studies, instead of generalisation, the usability of the results can
be preferred in similar cases. Guidelines for validity evaluation of research
results presented by Runeson et al. (2012) were used in Papers I, III, and VII.
In this dissertation, construct validity issues concern the selected
frameworks and cognitive services that affect the constructions by which
proofs of concept by cognitive services were used to frame the cognitively
computed insights. The construction was defined as a structure of the
explained parts that can be extended or specialised in supporting
something. The established constructions are based on well-established
frameworks having defined concepts/terms and relationships between the
concepts/terms. Moreover, constructions based on the selected commercial
cognitive services represented and still represent the most popular services.
Additional studies on this research process for establishing proofs of
concept are needed so that the quality of the process may be further refined
and improved through quantitative proofs.
The chosen research methods affected both the reliability and validity of
the results. The results are mainly subjective because qualitative methods
were used.
4.4 Future research issues

Enhance the mapping from automation capabilities and human
cognitive processes to cognitive services. Additional studies on traceability
are needed to determine open and standardised measures to reach valid
and reliable insights. This research concept needs future refinements, and
more research cases are required to test and enhance the phases, for
example case-based requirements and arguments. McKinsey's automation
capabilities, supplemented by cognitive processes, did not help utilise
cognitive services as work activities. However, the cognitive equivalence
achieved between the automation capabilities can be useful in selecting a
cognitive function with the capabilities required by the system as part of
the system's building blocks.
Enhance the framework of the functional hierarchy. The cognitive
services that contain cognitive functions are building blocks of cognitive
systems. The findings regarding the computational cognitive functions are
consistent with the cognitive computing literature (Srivathsan and Arjun,
2015; Chen et al., 2016; Williamson, 2017; IBM, 2021a). Cognitive functions
could, for example, be weighted at each category level. This improvement
would enable us to refine descriptions of functions, thus enhancing
comparability.
Further, research could be deepened by using the hierarchy of cognitive
functions as an intermediary between work activity functions and
computational cognitive functions, thus producing more precise matches
and suggestions for cognitive function versus automation capabilities.
Enhance proofs of concept by common controls of authority documents
and ethical issues. If the cognitive services concerning personality traits and
other individual characteristics are utilised, then common control of the
authority documents and ethical issues must be taken into consideration. It
might be more acceptable to use, for example, personality trait-based
profiling if the manifesting solutions are transparent from raw data to traits.
Moreover, preventive behaviour management might require combining
individual data without fear of fines based on privacy regulations such as
the General Data Protection Regulation (GDPR).
Enhance proofs of concept by curated tools and techniques. For example,
well-established corpora and IPIP items can be used in embeddings so that
the IPIP items can be related to personality characteristics within the
embedding space.
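As a minimal sketch of that idea, the toy example below relates IPIP-style items to trait labels by cosine similarity in a shared vector space. The four-dimensional vectors are invented for illustration; a real setup would instead average pretrained word2vec or GloVe embeddings over the words of each item.

```python
import math

# Hypothetical 4-dimensional embeddings of two IPIP-style items and two traits.
item_vectors = {
    "I am the life of the party.": [0.9, 0.1, 0.2, 0.0],
    "I worry about things.":       [0.1, 0.8, 0.1, 0.1],
}
trait_vectors = {
    "extraversion": [0.85, 0.05, 0.25, 0.05],
    "neuroticism":  [0.05, 0.9, 0.05, 0.1],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Relate each item to its nearest trait in the embedding space.
for item, vec in item_vectors.items():
    best = max(trait_vectors, key=lambda t: cosine(vec, trait_vectors[t]))
    print(f"{item!r} -> {best}")
```

With these toy vectors the first item lands on extraversion and the second on neuroticism; the same nearest-neighbour step works unchanged when the vectors come from a pretrained embedding model.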
Enhance argumentative and transparent insights. ArchiMate is a
modelling language and an enterprise architecture standard of The Open
Group, fully aligned with TOGAF (The Open Group Architecture Framework)
(Beauvoir et al., 2021). ArchiMate can describe strategy elements such as
resources, capabilities, value streams, and actions; therefore, this approach
can add transparency to existing and new technological solutions.
5 BIBLIOGRAPHY

Abadi M., Agarwal A., Barham P., Brevdo E., Chen Z., Citro C., Corrado G. S.,
Davis A., Dean J., Devin M., Ghemawat S., Goodfellow I., Harp A., Irving
G., Isard M., Jozefowicz R., Jia Y., Kaiser L., Kudlur M., Levenberg J.,
Mané D., Schuster M., Monga R., Moore S., Murray D., Olah C., Shlens
J., Steiner B., Sutskever I., Talwar K., Tucker P., Vanhoucke V.,
Vasudevan V., Viégas F., Vinyals O., Warden P., Wattenberg M., Wicke
M., Yu Y., Zheng X. 2015. TensorFlow: Large-scale machine
learning on heterogeneous systems. Software available from
tensorflow.org
Ahlemeyer-Stubbe A., Coleman, S. 2014. A Practical Guide to Data Mining
for Business and Industry. Wiley, Hoboken, 35–36.
Ansoff I. 1984. Implanting Strategic Management. Prentice/Hall
International Inc.
Baškarada S., Koronios A. 2013. Data, information, knowledge, wisdom
(DIKW): a semiotic theoretical and empirical exploration of the
hierarchy and its quality dimension. Australasian Journal of
Information Systems, 18(1): 1–24.
Beauvoir P., Sarrodie J-B., The Open Group. 2021. Archi the free ArchiMate
Modelling Tool.
https://www.archimatetool.com/downloads/Archi%20User%20Guide.
pdf
Beller C., Katz G., Ginsberg A., Phipps C., Bethard S., Chase P., Shek E.,
Summers K. 2016. Watson Discovery Advisor: Question-answering in
an industrial setting. In Proceedings of the Workshop on Human-
Computer Question Answering, 1–7.
Blei D. M. 2012. Probabilistic Topic Models. Communications of the ACM,
55(4): 77–84.
Blumberg R., Atre S. 2003. The problem with unstructured data. DM
Review, 13(62): 42–49.
Bøe-Lillegraven T. 2014. Untangling the ambidexterity dilemma through big
data analytics. Journal of organization design, 3(3): 27–37.
Bresciani S., Ferraris A., Del Giudice M. 2018. The management of
organizational ambidexterity through alliances in a new context of
analysis: Internet of Things (IoT) smart city projects. Technological
Forecasting & Social Change, 136: 331–338
Burgelman R. A. 1988. Strategy making as a social learning process: The
case of internal corporate venturing. Interfaces, 18(3): 74–85.
Cambridge University Press. 2021. Cambridge Dictionary.
http://dictionary.cambridge.org/dictionary/english/
Chen Y. 2017. Dynamic ambidexterity: How innovators manage exploration
and exploitation, Business Horizons, 60(3): 385–394.
Chen Y., Argentinis J. D. E., Weber G. 2016. IBM Watson: how cognitive
computing can be applied to big data challenges in life sciences
research. Clinical therapeutics, 38(4): 688–701.
DeFranco J. F., Laplante P. A. 2017. A content analysis process for
qualitative software engineering research. Innovations in Systems
and Software Engineering, 13(2): 129-141.
De Raad B. 2000. The Big Five Personality Factors: The psycholexical
approach to personality. Hogrefe & Huber Publishers.
De Saussure F. 1983. Course in general linguistics. London, Gerald
Duckworth & Co.
Dictionary.com. 2021. Rock Holdings Inc,
https://www.dictionary.com/browse/search
Duncan R. B. 1976. The ambidextrous organization: Designing dual
structures for innovation. The management of organization, 1(1): 167-
188.
Earley S. 2015. Cognitive computing, analytics, and personalization. IT
Professional, 17(4): 12–18.
ElBedwehy M. N., Ghoneim M. E., Hassanien A. E., Azar, A. T. 2014. A
computational knowledge representation model for cognitive
computers. Neural Computing and Applications, 25(7-8), 1517-1534.
European Commission (EC). 2016. Regulation (EU) 2016/679 of the
European Parliament and of the Council of 27 April 2016 on the
Protection of Natural Persons with Regard to the Processing of
Personal Data and on the Free Movement of Such Data, and
Repealing Directive 95/46/EC (General Data Protection Regulation)
[online] https://ec.europa.eu/info/law/law-topic/data-protection/data-
protection-eu_en.
Ferrucci D., Brown E., Chu-Carroll J., Fan J., Gondek D., Kalyanpur A. A., Lally
A., Murdock J. W., Nyberg E., Prager J., Schlaefer N., Welty C. 2010.
Building Watson: An Overview of the DeepQA Project. AI Magazine,
31(3): 59–79.
Garousi V., Felderer M., & Mäntylä M. V. 2019. Guidelines for including grey
literature and conducting multivocal literature reviews in software
engineering. Information and Software Technology, 106: 101–121.
Google. 2013. word2vec. https://code.google.com/archive/p/word2vec/
Grammarly. 2020. Meet Grammarly’s tone detector.
https://www.grammarly.com/tone
Hachey B., Radfordb W., Nothmanb J., Honnibal M., Curran J. R. 2013.
Evaluating entity linking with Wikipedia. Artificial Intelligence 194:
130–150.
Hiltunen E. 2010. Weak signals in organizational futures learning. Aalto
University School of Economics.
Hoffenberg S. 2016. What’s the difference between artificial intelligence
and cognitive computing. IBM’s Watson Answers the Question.
http://www.vdcresearch.com/News-events/iot-blog/IBM-Watson-
Answers-Question-Artificial-Intelligence.html
Hofmann T. 2013. Probabilistic Latent Semantic Analysis. Cornell University
Library arXiv.org, arXiv:1301.6705.
Iafrate F. 2018. Artificial Intelligence and Big Data: The Birth of a New
Intelligence. Wiley Online Library, Volume 8 (2). https://onlinelibrary-
wiley-com.ezproxy.uef.fi:2443/doi/book/10.1002/9781119426653
IBM 2017a. About Alchemy Language.
https://www.ibm.com/watson/services/alchemy-language-migration/
IBM 2017b. IBM Watson Retrieve and Rank tool. Dale Lane.
https://www.youtube.com/watch?v=SveIksv7V9E 2min18sek.
IBM 2017c. The science behind the service, Trade off analytics.
https://www.ibm.com/blogs/cloud-archive/2015/05/watson-tradeoff-
analytics/
IBM 2018. Natural Language Understanding. https://natural-language-
understanding-demo.ng.bluemix.net
IBM 2019. Retrieve and Rank.
https://www.ibm.com/watson/services/retrieve-and-rank/
IBM 2021a. IBM Cloud Docs, https://cloud.ibm.com/docs/search?
IBM 2021b. IBM Watson Natural Language Understanding Text Analysis.
https://www.ibm.com/demos/live/natural-language-
understanding/self-service/home
IBM 2021c. IBM Cloud products. https://www.ibm.com/cloud/products
IEEE Computer Society and ISO/IEC JTC 1/SC7. 2021. SEVOCAB: Software
and Systems Engineering Vocabulary,
https://pascal.computer.org/sev_display/index.action
International Organization for Standardization, ISO. (2021), ISO Online
Browsing Platform (OBP). https://www.iso.org/obp/ui
International Personality Item Pool (IPIP). 2017. A Scientific Collaboratory
for the Development of Advanced Measures of Personality Traits and
Other Individual Differences. http://ipip.ori.org,
https://ipip.ori.org/newscoringinstructions.htm
Johnson J. A. 2017. Big-Five model. In V. Zeigler-Hill, T.K. Shackelford (Eds.),
Encyclopedia of Personality and Individual Differences (1-16). New
York: Springer. DOI: 10.1007/978-3-319-28099-8_1212-1
March J. G. 1991. Exploration and Exploitation in Organizational Learning.
Organization Science, 2(1): 71–87.
Katila R., Ahuja G. 2002. Something old, something new: A longitudinal
study of search behavior and new product introduction. Academy of
management journal, 45(6): 1183–1194.
Kaltenrieder P., Portmann E., Myrach T. 2015. Fuzzy knowledge
representation in cognitive cities, IEEE International Conference on
Fuzzy Systems, 1–8.
Kalyanpur A., Patwardhan S., Boguraev B. K., Lally A., Chu-Carroll J. 2012.
Fact-based question decomposition in DeepQA. IBM Journal of
Research and Development, 56(3.4) 13:1–13.
Kaufman M., Bowles A., Hurwitz JS., Hurwitz J. 2015. Cognitive Computing
and Big Data Analytics, John Wiley & Sons, Incorporated, Somerset.
Available from: ProQuest Ebook Central. [9 July 2020].
Kauppila O. P. 2010. Creating ambidexterity by integrating and balancing
separate interorganizational partnerships. Strategic organization,
8(4): 283–312.
Kelly J. 2015a. The Future of Cognitive computing.
https://www.youtube.com/watch?v=q7qElhGv7uY
Kelly J. E. 2015b. Computing, cognition and the future of knowing How
humans and machines are forging a new age of understanding, IBM
White Paper.
http://researchweb.watson.ibm.com/software/IBMResearch/multime
dia/Computing_Cognition_WhitePaper.pdf
https://www.academia.edu/24586152/Computing_cognition_and_the_
future_of_knowing_How_humans_and_machines_are_forging_a_new_
age_of_understanding
Kitchenham B., Charters S. 2007. Guidelines for performing Systematic
Literature Reviews in Software Engineering. EBSE Technical Report
EBSE-2007-01.
Krug M., Wiedemann F., Gaedke M. 2014. Enhancing media
enrichment by semantic extraction. In Proceedings of the 23rd
International Conference on World Wide Web (WWW '14 Companion).
Association for Computing Machinery, New York, NY, USA, 111–114.
DOI: https://doi.org/10.1145/2567948.2577025
Langley D. J., van Doorn J., Ng I. C., Stieglitz S., Lazovik A., Boonstra A. 2021.
The Internet of Everything: Smart things and their impact on business
models. Journal of Business Research, 122, 853-863.
Lanigan R. L. 1994. Capta versus data: method and evidence in
communicology. Phenomenology in Communication Research,
Human Stud. 17(1), 109–130.
Laukkanen S. 2012. Making sense of ambidexterity: A Process View of the
Renewing Effects of Innovation Activities in Multinational Enterprise,
Publications of the Hanken School of Economics Nr 243.
Lexico.com – Oxford Dictionaries – perception. 2018.
https://en.oxforddictionaries.com/definition/perception
Louridas P., Ebert C. 2016. Machine Learning, IEEE Software, The IEEE
Computer Society, 110-115.
Lucas Jr H. C., Goh, J. M. 2009. Disruptive technology: How Kodak missed
the digital photography revolution. The Journal of Strategic
Information Systems, 18(1): 46-55.
Marchevsky A. M., Walts A. E., Wick M. R. 2017. Evidence-based
pathology in its second decade: toward probabilistic cognitive
computing. Human pathology, 61: 1–8.
McKinsey Global Institute. 2017. A future that works: automation,
employment, and productivity. McKinsey Global Institute Research,
Tech. Rep, 60. https://www.mckinsey.com/featured-insights/digital-
disruption/harnessing-automation-for-a-future-that-works.
Manning C. D., Schütze H. 1999. Foundations of Statistical natural
language processing. MIT Press 5: 141–177.
Manning C. 2017. Information Extraction and Named Entity Recognition,
Natural Language Processing.
https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1106/hand
outs/InfoExtract-cs224n-2010-1up.pdf
Mardjan M. 2020. Free and Open Machine Learning Release 1.0.1
https://editorialia.com/2020/02/03/free-and-open-machine-learning-
documentation-release-0-3/
Marr B. 2016. Key business analytics – the 60+ business analytics tools
every manager need to know. Wiley.
Maurya A. 2017. Why lean canvas vs business model canvas?
https://blog.leanstack.com/why-lean-canvas-vs-business-model-
canvas-af62c0f250f0.
Maybury M. T. 2004. New Directions in Question Answering. Cambridge
MA: The MIT Press.
Merret R. 2015. Doing natural language processing with neural nets
without the high costs. CIO, IDG Communications
http://www2.cio.com.au/article/574206/doing-natural-language-
processing-neural-nets-without-high-cost/
Moez A. 2020a. Machine Learning in Power BI using Pycaret. Towards data
science https://towardsdatascience.com/machine-learning-in-power-
bi-using-pycaret-34307f09394a
Moez A. 2020b. PyCaret: An open source, low-code machine learning
library in Python, PyCaret version 1.0.0. https://www.pycaret.org
Mosquera A., Lloret E., Moreda P. 2012. Towards facilitating the
accessibility of web 2.0 texts through text normalisation. In
Proceedings of the LREC Workshop: Natural Language Processing for
Improving Textual Accessibility (NLP4ITA).
http://lrec.elra.info/proceedings/lrec2012/workshops/25.NLP4ITA-
Proceedings.pdf#page=14
Microsoft (MS). 2017b. Speaker Recognition API. Demonstration.
https://azure.microsoft.com/en-us/services/cognitive-
services/speaker-recognition/
Microsoft (MS). 2020a. What is speech-to-text? Microsoft Azure.
https://docs.microsoft.com/en-us/azure/cognitive-services/speech-
service/speech-to-text
Microsoft (MS). 2020b. How to: Detect sentiment using the Text Analytics
API. https://docs.microsoft.com/en-us/azure/cognitive-services/text-
analytics/how-tos/text-analytics-how-to-sentiment-
analysis?tabs=version-3
Microsoft (MS). 2021. Azure Cognitive Services. Microsoft Azure.
https://azure.microsoft.com/en-us/services/cognitive-
services/?v=17.29
Mussalo P., Gain U., Hotti V. 2018. Types of Data Clarify Senses of Data
Processing Purpose in Health Care. In EFMI-STC, 55-59.
Nahmias M. A., Shastri B. J., Tait A. N., Prucnal P. R. 2013. A leaky integrate-
and-fire laser neuron for ultrafast cognitive computing. IEEE Journal
of selected topics in quantum electronics, 19(5), 1-12.
Neuendorf K.A. 2002. The Content Analysis Guidebook, SAGE Publications,
Thousand Oaks, CA, 156.
Okoli C., Schabram K. 2010. A Guide to Conducting a Systematic Literature
Review of Information Systems Research. Sprouts: Working Papers on
Information Systems 10(26):1-50.
O’Reilly III C. A., Tushman, M. L. 2004. The ambidextrous organization.
Harvard Business Review, 82(4), 74-81.
O’Reilly III C. A., Tushman M. L. 2013. Organizational ambidexterity: past,
present, and future. The Academy of Management Perspectives, 27(4):
324–338.
Osterwalder A. 2004. The business model ontology a proposition in a
design science approach. Thèse de doctorat : Université de Lausanne.
Osterwalder A. 2012. Achieve Product-Market Fit with our Brand-New Value
Proposition Designer. https://www.patrinum.ch/record/15985/usage
Pang B., Lee L. 2008. Opinion mining and sentiment analysis. Foundations
and Trends in Information Retrieval, 2(1–2): 1–135.
Pasca M. 2007. Lightweight web-based fact repositories for textual
question answering. In Proceedings of the sixteenth ACM conference
on Conference on information and knowledge management, 87–86.
PAT Research - Predictive analytics today. 2021. Predictive analytics today,
what is cognitive computing? Top 10 cognitive computing companies,
Predictive analytics today
https://www.predictiveanalyticstoday.com/what-is-cognitive-
computing/#content-anchor
Peirce C. S. 1868. Some Consequences of Four Incapacities, Journal of
Speculative Philosophy, 140–157.
Peng C. C., Lakis M., Pan J. W. 2015. Detecting Sarcasm in Text.
cs229.stanford.edu/proj2015/044_report.pdf
Pennington J., Socher R., Manning C. D. 2014. GloVe: Global Vectors for
Word Representation. In Proceedings of the 2014 conference on
empirical methods in natural language processing (EMNLP), 1532–
1543. https://nlp.stanford.edu/projects/glove/
Raisch S., Birkinshaw J. 2008. Organizational Ambidexterity: Antecedents,
Outcomes, and Moderators, Journal of Management 34(3): 375–409.
Rajaraman A. Ullman J. D. 2011. Mining of Massive Datasets, Cambridge
University Press, 1–19.
Random House Value Publishing. 1996. Webster’s Encyclopedic unabridged
dictionary of the English language. Library of Congress Cataloging-in-
Publication Data, Grammercy Books. ISBN: 0-517-15026-3
Rauhala L. 1995. Tajunnan itsepuolustus. Yliopistopaino.
Rauhala L. 2009. Ihminen kulttuurissa – kulttuuri ihmisessä. Gaudeamus.
Runeson P., Host M., Rainer A., Regnell B. 2012. Case study research in
software engineering: Guidelines and examples. John Wiley & Sons.
Salton G. 1988. Automatic Text Processing. Addison-Wesley Publishing
Company.
SAS. 2021. History and evolution of big data analytics.
https://www.sas.com/en_ca/insights/analytics/big-data-analytics.html
Simsek Z. 2009. Organizational ambidexterity: towards a multilevel
understanding. Journal of management studies, 46(4): 597–624.
Simsek Z., Heavey C., Veiga J. F., & Souder D. 2009. A typology for aligning
organizational ambidexterity's conceptualizations antecedents and
outcomes. Journal of management studies, 46(5): 864–894.
Srivathsan M., Arjun K. Y. 2015. Health monitoring system by prognotive
computing using big data analytics. Procedia Computer Science,
50:602-609.
Few S. 2015. Signal: Understanding What Matters in a World of Noise. Analytics Press. ISBN: 978-1-938377-05-1.
Takahashi M., Overton W. F. 2005. Cultural foundations of wisdom: An
integrated developmental approach, A Handbook of Wisdom:
Psychological Perspectives, Edition: 1, Sternberg R., Jordan J.
Cambridge University Press, 32–60.
Talkspace. 2020. Feeling better starts with a single message.
https://www.talkspace.com/
The Open Group. 2021. TOGAF. Content Metamodel
https://pubs.opengroup.org/architecture/togaf91-
doc/arch/chap34.html
Tushman M. L., O'Reilly III, C. A. 1996. Ambidextrous organizations:
Managing evolutionary and revolutionary change. California
management review, 38(4), 8-29.
Unified compliance. 2021. UCF compliance dictionary.
https://www.unifiedcompliance.com/education/compliance-
dictionary/
Wan X., Cenamor J., Parker G., Van Alstyne M. 2017. Unraveling platform
strategies: A review from an organizational ambidexterity
perspective. Sustainability, 9(5), 734.
Wang Y., Wang Y., Patel S., Patel D. 2006. A layered reference model of the
brain (LRMB). IEEE Transactions on Systems, Man, and Cybernetics,
Part C (Applications and Reviews), 36(2): 124–133.
Wang Y. 2009. Toward a Formal Knowledge System Theory and Its
Cognitive Informatics Foundations. Transactions on Computational
Science V, Special Issue on Cognitive Knowledge Representation,
Springer, 1–19.
Wang Y. 2015. Formal Cognitive Models of Data, Information, Knowledge,
and Intelligence. WSEAS Transactions on Computers, 14(3): 770–781.
Webber B., Webb N. 2010. Question Answering, The Handbook of
Computational Linguistics and Natural Language Processing.
Williamson B. 2017. Computing brains: learning algorithms and
neurocomputation in the smart city. Information, Communication &
Society, 20(1): 81–99.
Zimmermann A., Raisch S., Birkinshaw J. 2015. How Is Ambidexterity
Initiated? The Emergent Charter Definition Process. Organization
Science, 1119–1139.
6 PAPERS
Paper I
Authors: Hotti V., Gain U. (2013).

Article title: Big Data Analytics for Professionals, Data-milling for Laypeople.

Journal: World Journal of Computer Application and Technology, 1(2): 51–57.

Publisher: Horizon Research Publishing.

Permissions from co-authors via email: Hotti Virpi received 9.11.2021 at 7:57

Copyright: World Journal of Computer Application and Technology 1(2): 51-57, 2013. http://www.hrpub.org. DOI: 10.13189/wjcat.2013.010205

Big Data Analytics for Professionals, Data-milling for Laypeople

Ulla Gain 1,*, Virpi Hotti 2
1 Foster Wheeler Energia Oy, 78201 Varkaus, Finland
2 Department of Computer Science, University of Eastern Finland, 70211 Kuopio, Finland
* Corresponding Author: [email protected]

Copyright © 2013 Horizon Research Publishing. All rights reserved.
Abstract  There exist large amounts of heterogeneous digital data. This phenomenon, called Big Data, will be examined here. The examination of Big Data has been launched as Big Data analytics. In this paper, we present a literature review of definitions for Big Data analytics. The objective of this review is to describe currently reported knowledge in terms of how Big Data analytics is defined in the articles that could be found in the ACM and IEEE Xplore databases in June 2013. We found 19 defining parts of articles for Big Data analytics. Our review shows that Big Data analytics is verbosely explained, and the explanations have been meant for professionals. Furthermore, the findings show that the concept of Big Data analytics is unestablished. Big Data analytics is ambiguous to professionals – how would we explain it to laypeople (e.g., leaders)? Therefore, we launch the term data-milling to illustrate an effort to uncover information nuggets. Data-milling can be seen as an examination of heterogeneous data or as part of competitive advantage. Our example concerns investments in coal power plants in Europe.

Keywords  Big Data Analytics, Literature Review, Data-milling, Information Nugget

1. Introduction

Big Data as a term first appeared in the literature towards the end of the 1990s [1]. In the year 2012, Chaudhuri [2] crystallized the term as follows: "Big Data symbolizes the aspiration to build platforms and tools to ingest, store and analyze data that can be voluminous, diverse, and possibly fast changing".

It is not clear who are professionals and who are laypeople in the Big Data era. For example, computer scientists, statisticians, mathematicians, and informaticians have Big Data capabilities [3]. Usually, laypeople are not interested in platforms and tools – they are interested in the results of analyzed data. Furthermore, laypeople might worry about growing data masses ("90% of the data in the world today has been created in the last two years alone" [4]) and their analysis ("3% of the potentially useful data is tagged, and even less analyzed" [5]).

There are many challenges in examining heterogeneous (i.e., diverse) data. We want to use the term data-milling to illustrate examining heterogeneous data. Before we launch the term data-milling (Section 3), we have to research how the term Big Data analytics is defined (Section 2). Therefore, we make a literature review. Furthermore, we research an example from the energy industry and discuss through that example (Section 4). The example is connected to decision-making on coal power plant investments in Europe.

Our research strategy is partly descriptive and partly improving. Our literature review of Big Data analytics describes the current status of the Big Data phenomenon. Through launching the term data-milling, we try to improve understanding of the Big Data phenomenon, as well as the possibilities of data analytics.

2. Literature Review of Big Data Analytics

Reviews of research literature are conducted to provide "a theoretical background for subsequent research", to learn "the breadth of research on a topic of interest" and to answer "practical questions by understanding what existing research has to say on the matter" [6]. We do not make a systematic literature review, that is, "a form of secondary study that uses a well-defined methodology to identify, analyze and interpret all available evidence related to a specific research question in a way that is unbiased and (to a degree) repeatable" [7]. However, we will explicitly explain the procedure of our literature review. Therefore, we have partly adapted two review guidelines: Okoli and Schabram [6], and Kitchenham and Charters [7].

The topic of our review is how Big Data analytics has been defined. A definition can be described as "a statement expressing the essential nature of something" [8]; this has further been stated in the following way [9]: "Definitions are statements describing a concept, and terms are expressions used to
refer to concepts". Our review process has the following steps:
1. Specifying the search terms
2. Selecting the databases
3. Searching for the papers
4. Appraising the hits and selecting the papers
5. Citing the definitions from the papers

We made three delimitations: the articles were fetched from the databases according to Science and mathematical sciences, and the terms were searched only from the titles of the papers. We specified our search terms when we planned our research strategies. However, we made three experimental searches. In the first experimental search, three articles were found with the search term data-milling (Table 1).

Table 1. Hits for data-milling

  Database                     Hits
  IEEE Xplore                  0
  ACM                          0
  ScienceDirect (Elsevier)     0
  SpringerLink                 0
  Web of Science – WoS (ISI)   3

When the articles were examined more closely, it was noticed that the titles of the articles had been written wrongly into the Web of Science – WoS (ISI) database; the word mining should have been used instead of the word milling. In the second experimental search, we tried to find definitions for advanced data analytics. However, we did not find any. In the third experimental search, we used the search term Big Data analytics without quotation marks. We got 71 hits (Table 2).

Table 2. Hits for Big Data analytics

  Database                     Hits
  IEEE Xplore                  31
  ACM                          26
  ScienceDirect (Elsevier)     5
  SpringerLink                 1
  Web of Science – WoS (ISI)   8

Table 3. Hits for "Big Data analytics"

  Database                     Hits   Selected
  IEEE Xplore                  12     12
  ACM                          14     11
  ScienceDirect (Elsevier)     2      0
  SpringerLink                 0      0
  Web of Science – WoS (ISI)   3      0

When we searched for "big data analytics", we found 29 articles (Table 3). We appraised the hits, and we desired to select papers only from two databases, the ACM and IEEE Xplore ones.

We went through all the IEEE Xplore hits [10,11,12,13,14,15,16,17,18,19,20,21]. Two ACM hits referred to the same IEEE article [17]. The contents of two ACM hits are similar, and therefore we used only one reference [23]. Finally, we went through only 11 ACM hits [22,23,24,25,26,27,28,29,30,31,32]. The main content of one ACM hit [28] is similar to the content of the IEEE article [13]. First, we cited the papers to find the statements expressing big data analytics. We marked the excluded parts with three dots (. . .), and we commented on direct quotations as additional clarifications in curly brackets ({}). We found the following statements (excluding [18,20,31]):

1. "In order to promptly derive insight from big-data, enterprises have to deploy big-data analytics into an extraordinarily scalable delivery platform . . . Our MOBB approach has been designed for data-intensive tasks (e.g., big-data analytics) that typically require special platforms such as MapReduce cluster and especially, can run in parallel" [10]

2. "A big data analytics ecosystem built around MapReduce is emerging alongside the traditional one built around RDBMS" [11]

3. "Across disciplines, big data has been attracting significant attention globally from government funding agencies, academia, and industry. The field of AI is no exception, with its particular emphasis on developing specialized data mining methods to explore big data, among other closely related research topics that can be broadly labeled as analytics" [12]

4. "Parallel database systems and MapReduce systems (most notably Hadoop) are essential components of today's infrastructure for Big Data analytics" [13,28]

5. "Big data analytics use compute-intensive data mining algorithms that require efficient high-performance processors to produce timely results. Cloud computing infrastructures can serve as an effective platform for addressing both the computational and data storage needs of big data analytics applications . . . Advanced data mining techniques and associated tools can help extract information from large, complex datasets that is useful in making informed decisions in many business and scientific applications including tax payment collection, market sales, social studies, biosciences, and high-energy physics. Combining big data analytics and knowledge discovery techniques with scalable computing systems will produce new insights in a shorter time . . . Developers and researchers can adopt the software as a service (SaaS), platform as a service (PaaS), and infrastructure as a service (IaaS) models to implement big data analytics solutions in the cloud. The SaaS model {the first SaaS definition:} offers complete big data analytics applications to end users, who can exploit the cloud's scalability in both data storage and processing power to execute analysis on large or
complex datasets . . . {the second SaaS definition:} provides a well-defined data mining algorithm or ready-to-use knowledge discovery tool as an Internet service to end users, who can access it directly through a Web browser . . . {the third SaaS definition:} A single and complete data mining application or task (including data sources) offered as a service" [14]

6. "Cloud computing makes data analytics an attractive preposition for small and medium organisations that need to process large datasets and perform fast queries" [15]

7. "Big Data analytics is a fast growing and influential practice" [16]

8. "we consider click stream processing (most widely used case of Big Data analytics). In future, predictive models and feature sets can be identified for other Big Data analytics workloads/data sets" [17]

9. "A key part of big data analytics is the need to collect, maintain and analyze enormous amounts of data efficiently. To address these needs, frameworks based on MapReduce are used for processing large data-sets using a cluster of machines" [19]

10. "The need to process and analyze such massive datasets has introduced a new form of data analytics called Big Data Analytics. Big Data analytics involves analyzing large amounts of data of a variety of types to uncover hidden patterns, unknown correlations and other useful information. Many organizations are increasingly using Big Data analytics to get better insights into their businesses, increase their revenue and profitability and gain competitive advantages over rival organizations . . . Big Data analytics platform in today's world often refers to the Map-Reduce framework . . . Map-Reduce framework provides a programming model using "map" and "reduce" functions over key-value pairs that can be executed in parallel on a large cluster of compute nodes . . . The other key aspect of Big Data analytics is to push the computation near the data. Generally, in a Map-Reduce environment, the compute and storage nodes are the same, i.e. the computational tasks run on the same set of nodes that hold the data required for the computations" [21]

11. "In the age of big data, businesses compete in extracting the most information out of the immense amount of data they acquire. Since more information translates almost directly into better decisions that provide a much sought-after competitive edge, big data analytics tools promising to deliver this additional bit of information are highly-valued. There are two major issues that have to be addressed by any such tool. First, they have to cope with massive amounts of data . . . Second, the tools have to be general and extensible. They have to provide a large spectrum of data analysis methods ranging from simple descriptive statistics to complex predictive models. Moreover, the tools should be easily extensible with new methods without major code development" [22]

12. "Many state-of-the-art approaches to both of these challenges {more and more data comes in diverse forms, the proliferation of ever-evolving algorithms to gain insights from data} are largely statistical and combine rich databases with software driven by statistical analysis and machine learning. Examples include Google's Knowledge Graph, Apple's Siri, IBM's Jeopardy-winning Watson system, and the recommendation systems of Amazon and Netflix. The success of these big-data analytics–driven systems, also known as trained systems, has captured the public imagination, and there is excitement about bringing such capabilities to other applications in enterprises, health care, science, and government" [23]

13. "Big data analytics has become critical for industries and organizations to extract useful information from huge and chaotic data sets to support their core operations in many business and scientific applications. Meanwhile, the computing speed of commodity computers and the capacity of storage systems continue to improve while their unit prices continue to decrease. Nowadays, it is a common practice to deploy a large scale cluster with commodity computers as nodes for big data analytics" [24]

14. "Decision makers of all kinds, from company executives to government agencies to researchers and scientists, would like to base their decisions and actions on this data. In response, a new discipline of big data analytics is forming. Fundamentally, big data analytics is a workflow that distills terabytes of low-value data (e.g., every tweet) down to, in some cases, a single bit of high-value data (Should Company X acquire Company Y? Can we reject the null hypothesis?). The goal is to see the big picture from the minutia of our digital lives . . . The term analytics (including its big data form) is often used broadly to cover any data-driven decision making. Here, we use the term for two groups: corporate analytics teams and academic research scientists. In the corporate world, an analytics team uses their expertise in statistics, data mining, machine learning, and visualization to answer questions that corporate leaders pose. They draw on data from corporate sources (e.g., customer, sales, or product-usage data) called business information, sometimes in combination with data from public sources interactions (e.g. tweets or demographics) . . . In the academic world, research scientists analyze data to test hypotheses and form theories. Though there are undeniable differences with corporate analytics (e.g., scientists typically choose their own research questions, exercise more control over the source data, and report results to knowledgeable peers), the overall analysis workflow is often similar . . . today's big data analytics is a throwback to an earlier age of mainframe computing . . . as an emerging type of knowledge work" [25]

15. "In order to extract value out of the data, the analysts need to apply a variety of methods {advanced analytical methods} ranging from statistics to machine learning and beyond" [26]

16. "A big data environment presents both a great opportunity and a challenge due to the explosion and heterogeneity of the potential data sources that extend the boundary of analytics to social networks, real time streams and other forms of highly contextual data that is characterized by high volume and speed" [27]
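Several of the quoted statements ([13], [19], [21]) lean on the Map-Reduce programming model of "map" and "reduce" functions over key-value pairs. The following single-machine Python sketch is our own illustration of that model on the canonical word-count task, not code from any cited system; on a Hadoop-style cluster, the framework would run the same two user functions in parallel over many nodes:

```python
from collections import defaultdict

# Map: each input record is turned into intermediate (key, value) pairs.
def map_words(record):
    for word in record.lower().split():
        yield (word, 1)

# Reduce: all values sharing a key are combined into one result.
def reduce_counts(key, values):
    return (key, sum(values))

def mapreduce(records):
    # 1. Map every record; a cluster would do this in parallel per node.
    pairs = [pair for record in records for pair in map_words(record)]
    # 2. Shuffle: group the intermediate values by key (the framework's job).
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    # 3. Reduce each group; independent keys could also run in parallel.
    return dict(reduce_counts(key, values) for key, values in groups.items())

print(mapreduce(["big data analytics", "big data"]))
# {'big': 2, 'data': 2, 'analytics': 1}
```

The point of the model, as statement 10 notes, is that both phases decompose over keys, so computation can be pushed to the nodes that already hold the data.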
17. "the important aspects of "big data" analytics:
 Big: the vast volumes and fast growth of datasets, requiring cost-effective storage (e.g., HDDs) and scalable solutions (e.g., scale-out architectures);
 Fast: the need for low-latency data analytics that can keep pace with business decisions;
 Total: the trend toward integration and correlation of multiple, potentially heterogeneous, data sources;
 Deep: the use of sophisticated analytics algorithms (e.g., machine learning and statistical analysis);
 Fresh: the need for near real-time integration as well as analytics on recently generated data." [29]

18. "Big data analytics is the process of examining large amounts of data (big data) in an effort to uncover hidden patterns or unknown correlations. Big Data Analytics Applications (BDA Apps) are a new type of software applications, which analyze big data using massive parallel processing frameworks (e.g., Hadoop)" [30]

19. "Today's data explosion, fueled by emerging applications, such as social networking, micro blogs, and the "crowd intelligence" capabilities of many sites, has led to the "big data" phenomenon. It is characterized by increasing volumes of data of disparate types (i.e., structured, semi-structured and unstructured) from sources that generate new data at a high rate (e.g., click streams captured in web server logs). This wealth of data provides numerous new analytic and business intelligence opportunities like fraud detection, customer profiling, and churn and customer loyalty analysis. Consequently, there is tremendous interest in academia and industry to address the challenges in storing, accessing and analyzing this data. Several commercial and open source providers already unleashed a variety of products to support big data storage and processing" [32]

The statements illustrate that big data analytics is ambiguous. However, the following statements can be taken to crystallize it:
– "Big Data analytics involves analyzing large amounts of data of a variety of types to uncover hidden patterns, unknown correlations and other useful information" [21]
– "Big data analytics has become critical for industries and organizations to extract useful information from huge and chaotic data sets to support their core operations in many business and scientific applications" [24]
– "big data analytics is a workflow that distills terabytes of low-value data . . . down to, in some cases, a single bit of high-value data . . . The goal is to see the big picture from the minutia of our digital lives" [25]
– "Big data analytics is the process of examining large amounts of data (big data) in an effort to uncover hidden patterns or unknown correlations" [30]

3. Data-milling

Data-milling will find a lot of information nuggets which help us to form an opinion or to decide a matter. It is not necessary to have predefined questions for data-milling. The computing world moves towards an ongoing cycle of data-milling (Figure 1). We get information nuggets, even without willing them, and our reactions depend on us.

Figure 1. Data-milling.

Information nuggets reveal different meanings to laypeople. Even one nugget can create deeper understanding and lead to reactions (e.g., one does something oneself or gives an assignment), or it may just be "nice to know" and not lead to any reaction. It is already a fact that data that is not hidden enables innovations. Everything depends on what the laypeople invent to do with the information nuggets from heterogeneous data. Furthermore, when information nuggets are available, it will be more difficult to present throws and claims without grounds.

Figure 2. Key-questions of data-milling (adapted from Eckerson[34])
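The crystallized view from [25], a workflow that distills masses of low-value records down to a single bit of high-value data, can be caricatured in a few lines of Python. The records, field names, and threshold below are all invented for illustration; they are not from any cited system or real data set:

```python
# Invented low-value records: crude sentiment scores mined from messages.
records = [
    {"source": "tweet", "mentions_coal": True,  "sentiment": -0.6},
    {"source": "news",  "mentions_coal": True,  "sentiment": -0.2},
    {"source": "tweet", "mentions_coal": False, "sentiment":  0.8},
    {"source": "blog",  "mentions_coal": True,  "sentiment":  0.1},
]

def distill(records, threshold=0.0):
    """Reduce many low-value records to one high-value bit:
    'is the relevant sentiment on balance positive?'"""
    relevant = [r["sentiment"] for r in records if r["mentions_coal"]]
    if not relevant:
        return None  # no evidence, no decision
    return sum(relevant) / len(relevant) > threshold

print(distill(records))  # False: the average relevant sentiment is negative
```

A real workflow would of course filter, aggregate, and model far more carefully; the sketch only shows the shape of the distillation, from many records in to one information nugget out.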
However, it is important even for laypeople to understand the complexity and possible value of data-milling (Figure 2). The key-questions are derived partly from descriptive and inferential analytics and partly from the simple taxonomy of business analytics, which is divided into three categories [33]: descriptive analytics uses the data to answer questions concerning the past and the present, predictive analytics answers questions concerning the future, and prescriptive analytics answers the questions of what should be done and why.

It is important for laypeople to understand that they do not have to understand even complex statistical things (Figure 3). First, we can use descriptive statistics to present some facts based on information nuggets which are categorical or numerical. If it seems worthwhile for laypeople to use professionals for inferential statistics, they both have some kind of common sense about "what may be calculated" and "what is worth calculating". There will be no sense in data analytics for laypeople if they do not have basic know-how of the interpretation of the results of the data analyses (i.e., what has been calculated and why). Laypeople may need professionals to do descriptive statistics. However, professionals are used for inferential statistics, as well as to do data-milling (i.e., assignments are made).

Figure 3. Key-questions for inferential statistics

Nowadays, there are professionals that have deep analytic skills [35]. When they examine Big Data, they have to have, first of all, know-how of algorithms, because data mining is needed and it "is about applying algorithms to data, rather than using data to "train" a machine-learning engine of some sort" [36]. The need for data mining can be crystallized within Hiltunen's [37] clause "it is possible to analyze all the qualitative data in quantitative form by using text and data mining tools".

4. Discussion through Coal Power

When we tried to find information nuggets for indicators for investments in coal power plants in Europe until the year 2020, we realized that we need a lot of data, for example, from social media, TV, news, and politics. First of all, there are a lot of potential indicators [38,39], not to mention that there is a vast amount of data that is not used in analytics or as a data source for indicators. Such data could contain vital information about organizations (e.g., products, processes, customers, competitors, and partners) and market trends. We started to talk about data-milling for providing information nuggets for indicators instead of the unfamiliar big data analytics.

We illustrated data-milling implicitly by business intelligence and strategic management for better competitive advantage (Figure 4). Actually, the business intelligence layer contains, for example, both descriptive statistics and inferential statistics.

Figure 4. Data-milling is a part of competitive advantage

There are miscellaneous data sources in Figure 4. Furthermore, there are even miscellaneous indicators (Table 4), adapted from Marr [40], and those are selected especially for our example case. The indicator called market growth rate shows if the market is growing or shrinking. This is a good indicator for predicting the future. The indicator called relative market share shows how well we are developing our market share compared with our competitors. The indicator called carbon footprint is used to sum the direct emissions of the greenhouse gases from the burning of fossil fuels for energy consumption and transportation. Furthermore, this indicator affects politics directly, and politics has effects against or in favor of an investment decision for coal power plants. The indicator called energy consumption explains coal power's share of the energy market. The indicator called savings levels due to conservation and improvement efforts is one of the important technological challenges for the coal power plants and indirectly for the investment decisions. The indicator called waste consumption rate is a favorite indicator for the investment decision. It measures the coal power plant.

When we have information nuggets for the selected indicators, we are going to use descriptive statistics to find out unfamiliar facts based on the information nuggets. We assume, for example, that our set of indicators can be changed. We believe that we will uncover hidden patterns, unknown correlations and other useful information.
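To make the descriptive-statistics step concrete, the sketch below computes one of the indicators discussed above, market growth rate, here taken as the period-over-period relative change, and then summarizes it as plain facts a layperson can read. The market-size figures are invented for illustration, not real energy-market data:

```python
from statistics import mean, median

# Invented annual market sizes (e.g., TWh of coal-generated energy).
market_size = {2009: 100.0, 2010: 104.0, 2011: 102.0, 2012: 99.0}

def growth_rates(series):
    """Market growth rate per year: (current - previous) / previous."""
    years = sorted(series)
    return {y: (series[y] - series[p]) / series[p]
            for p, y in zip(years, years[1:])}

rates = growth_rates(market_size)
print({y: round(r, 3) for y, r in rates.items()})
# {2010: 0.04, 2011: -0.019, 2012: -0.029}

# Descriptive statistics: the facts presented to laypeople, no inference needed.
print("mean growth:", round(mean(rates.values()), 4))
print("median growth:", round(median(rates.values()), 4))
```

Inferential statistics (e.g., testing whether the apparent shrinkage is significant, or correlating it with another indicator) would be the point at which, as argued above, laypeople hand the assignment to professionals.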
Table 4. Indicators for investments in the coal power plant

Indicator: Market growth rate
  Description: To what extent are we operating in markets with future potential?
  Source: Available market research data

Indicator: Relative market share
  Description: How well are we developing our market share in comparison to our competitors?
  Source: Annual reports

Indicator: Carbon footprint
  Description: How well do we safeguard the environment in the execution of our business operations?
  Source: Scientific journals/research, general values for the coal power plant

Indicator: Energy consumption
  Description: What is the energy consumption produced by coal power?
  Source: Energy companies' annual reports

Indicator: Savings levels due to conservation and improvement efforts
  Description: To what extent are we actively reducing the environmental impact of our business?
  Source: Total level of savings (in carbon emission, water usage, energy usage or cost)

Indicator: Waste consumption rate
  Description: To what extent are we recovering our waste for reuse or recycling for the energy production?
  Source: Energy statistics

5. Conclusion

There are a lot of heterogeneous data, and they might be openly available. For example, the public sector, mainly at the governmental level (e.g., in the United States and Britain), has made data available for free for anyone to use – the "openness of data means in practice that data has been made as easy as possible for anyone to use" [41].

In this article, we launched the term data-milling to represent the searching of information nuggets from heterogeneous data. To justify the launched term data-milling, we made a literature review in which we searched for the definitions of Big Data analytics. Our review showed that Big Data analytics is verbosely explained. We used only four statements out of 19 to crystallize Big Data analytics.

Our research strategy was partly descriptive and partly improving. Our literature review of Big Data analytics gave a description of the current status of the Big Data phenomenon. The launched term data-milling improves the understanding of the Big Data phenomenon, as well as the possibilities of data analytics. However, the explanatory and exploratory research strategies illustrate the reason for data-milling appositely, i.e., they seek an explanation for a situation or a problem, try to find out what is happening, seek new insights, and generate new ideas and hypotheses for future research [42].

REFERENCES

[1] D.E. O'Leary. Artificial Intelligence and Big Data. IEEE Computer Society, 96-99, 2013.

[2] S. Chaudhuri. How Different is Big Data? IEEE 28th International Conference on Data Engineering, 5, 2012.

[3] H. Topi. Where is Big Data in Your Information Systems Curriculum? acm Inroads, Vol. 4, No. 1, 12-13, 2013.

[4] IBM. Big Data at the Speed of Business, What is big data. Online available from http://www-01.ibm.com/software/data/bigdata/

[5] S. Alsubaiee, Y. Altowim, H. Altwaijry, A. Behm, V. Borkar, Y. Bu, M. Carey, R. Grover, Z. Heilbron, Y.-S. Kim, C. Li, N. Onose, P. Pirzadeh, R. Vernica, J. Wen. ASTERIX: An Open Source System for "Big Data" Management and Analysis (Demo). Proceedings of the VLDB Endowment, Vol. 5, No. 12, 1898-1901, 2012.

[6] C. Okoli, K. Schabram. A Guide to Conducting a Systematic Literature Review of Information Systems Research. Sprouts: Working Papers on Information Systems, 2010.

[7] B. Kitchenham, S. Charters. Guidelines for performing Systematic Literature Reviews in Software Engineering. EBSE Technical Report EBSE-2007-01, 2007.

[8] Merriam-Webster. Online available from http://www.merriamwebster.com/dictionary/definition

[9] H. Suonuuti. Guide to Terminology, 2nd edition. Tekniikan sanastokeskus ry, Helsinki, 2001.

[10] G. Jung, N. Gnanasambandam, T. Mukherjee. Synchronous Parallel Processing of Big-Data Analytics Services to Optimize Performance in Federated Clouds. IEEE 5th International Conference on Cloud Computing (CLOUD), 811-818, 2012.

[11] X. Qin, H. Wang, F. Li, B. Zhou, Y. Cao, C. Li, H. Chen, X. Zhou, X. Du, S. Wang. Beyond Simple Integration of RDBMS and MapReduce – Paving the Way toward a Unified System for Big Data Analytics: Vision and Progress. Second International Conference on Cloud and Green Computing (CGC), 716-725, 2012.

[12] D. Zeng, R. Lusch. Big Data Analytics: Perspective Shifting from Transactions to Ecosystems. IEEE Intelligent Systems, Volume 28, Issue 2, 2-5, 2013.

[13] A. Aboulnaga, S. Babu. Workload management for Big Data analytics. IEEE 29th International Conference on Data Engineering (ICDE), 1249, 2013.

[14] D. Talia. Clouds for Scalable Big Data Analytics. Computer, Volume 46, Issue 5, 98-101, 2013.

[15] A. Nazir, Y.M. Yassin, C.P. Kit, E.K. Karuppiah. Evaluation of virtual machine scalability on distributed multi/many-core processors for big data analytics. IEEE Conference on Open Systems (ICOS), 1-6, 2012.

[16] S. Singh, N. Singh. Big Data analytics. International Conference on Communication, Information & Computing Technology (ICCICT), 1-4, 2012.

[17] R.T. Kaushik, K. Nahrstedt. T*: A data-centric cooling energy costs reduction approach for Big Data analytics cloud.
World Journal of Computer Application and Technology 1(2): 51-57, 2013 57

International Conference for High Performance Computing, [29] J. Chang, K.T. Lim, J. Byrne, L. Ramirez, P. Ranganathan.
Networking, Storage and Analysis (SC), 11 pages, 2012. Workload diversity and dynamics in big data analytics:
implications to system designers. ASBD '12: Proceedings of
[18] Y. Simmhan, V. Prasanna, S. Aman, A. Kumbhare, R. Liu, S. the 2nd Workshop on Architectures and Systems for Big Data,
Stevens, Q. Zhao. Cloud-Based Software Platform For Big 21-26, 2012.
Data Analytics In Smart Grids. Accepted for publication in
Computing in Science & Engineering, IEEE, 2013. [30] W. Shang, Z.M. Jiang, H. Hemmati, B. Adams, A.E. Hassan,
P. Martin. Assisting developers of big data analytics
[19] N. Laptev, K. Zeng, C. Zaniolo. Very fast estimation for applications when deploying on hadoop clouds. ICSE '13:
result and accuracy of big data analytics: The EARL system. Proceedings of the 2013 International Conference on
IEEE 29th International Conference on Data Engineering Software Engineering, 402-411, 2013.
(ICDE), 1296-1299, 2013.
[31] A. Bhambhri. Six tips for students interested in big data
[20] G. Sijie, X. Jin, W. Weiping, L. Rubao. Mastiff: A analytics. XRDS: Crossroads, The ACM Magazine for
MapReduce-based System for Time-Based Big Data Students, Volume 19, Issue 1, 9, 2012. (19.)
Analytics. IEEE International Conference on Cluster
Computing (CLUSTER), 72-80, 2012. [32] A. Ghazal, T. Rabl, M. Hu, F. Raab, M. Poess, A. Crolotte,
H.-A. Jacobsen. BigBench: towards an industry standard
[21] A. Mukherjee, J. Datta, R. Jorapur, R. Singhvi, S. Haloi, W. benchmark for big data analytics. SIGMOD '13: Proceedings
Akram. Shared disk big data analytics with Apache Hadoop. of the 2013 international conference on Management of data,
19th International Conference on High Performance 1197-1208, 2013.
Computing (HiPC), 2012.
[33] D. Delen, H. Demirkan. Data, information and analytics as
[22] C. Qin, F. Rusu. Scalable I/O-bound parallel incremental services. Decision Support Systems, 55, 359-363, 2013.
gradient descent for big data analytics in GLADE. DanaC '13:
Proceedings of the Second Workshop on Data Analytics in [34] W. Eckerson. Predictive Analytics Extending the Value of
the Cloud, 16-20, 2013. Your Data Warehousing Investment, First quarter 2007
TDWI best practices report, 2007.
[23] A. Kumar, F. Niu, C. Ré. Hazy: Making It Easier to Build and
Maintain Big-Data Analytics. acmqueue-magazine - Web [35] McKinsey & Company. Big data: The next frontier for
Development, Volume 11, Issue 1, 1-17, January 2013. competition. Online available from http://www.mckinsey.co
Communications of the ACM , Volume 56, Issue 3, 40-49, m/features/big_data
2013.
[36] A. Rajaraman, J. Leskovec, J. D. Ullman. Mining of Massive
[24] Y. Huai, R. Lee, S. Zhang, C.H. Xia, X. Zhang. DOT: A Datasets. 2013. Online available from http://i.stanford.edu/~
Matrix Model for Analyzing, Optimizing and Deploying ullman/mmds/book.pdf
Software for Big Data Analytics in Distributed Systems.
SOCC '11: Proceedings of the 2nd ACM Symposium on [37] E. Hiltunen. Weak Signals in Organizational Futures. Aalto
Cloud Computing, 14 pages, 2011. University, 2012.

[25] D. Fisher, R. DeLine, M. Czerwinski, S. Drucker. Interactions [38] R. Baroudi, KPI Mega Library: 17,000 Key Performance
with big data analytics. interactions, Volume 19 Issue 3, 2012. Indicators. 2010.

[26] Y. Cheng, C. Qin, F. Rusu. GLADE: big data analytics made [39] European Commission, Europe 2020 indicators, Headline
easy. SIGMOD '12: Proceedings of the 2012 ACM SIGMOD indicators Online available from http://epp.eurostat.ec.europ
International Conference on Management of Data, 697-700, a.eu/portal/page/portal/europe_2020_indicators/headline_ind
2012. icators

[27] R. Bhatti, R. LaSalle, R. Bird, T. Grance, E. Bertino. [40] B. Marr, Key Performance Indicators, The 75 measures every
Emerging trends around big data analytics and security: panel. manager needs to know. Pearson education limited, 2012
SACMAT '12: Proceedings of the 17th ACM symposium on
Access Control Models and Technologie, 67-68, 2012. [41] Helsinki region infoshare, Open data. Online available from
http://www.hri.fi/en/about/open-data/
[28] A. Aboulnaga, S. Babu. Workload management for Big Data
analytics. SIGMOD '13: Proceedings of the 2013 [42] P. Runeson, M. Host, A. Rainer, B. Regnell. Case Study
international conference on Management of data, 929-931, Research in Software Engineering: Guidelines and Examples.
2013. Hoboken, New Jersey: John Wiley & Sons, Inc., 2012
Paper II
Authors: Virpi Hotti, Ulla Gain

Article title: Exploitation and Exploration Underpin Business and Insights Underpin Business Analytics

Journal: Communications in Computer and Information Science, 636:223–237, 2016

Publisher: Springer, Cham

Permissions from co-authors via email: Hotti Virpi, received 9.11.2021 at 7:57

Reproduced with permission from Springer Nature

Journal permissions: Bob Adegboyega, Permission Assistant, Springer Nature

Exploitation and Exploration Underpin
Business and Insights Underpin Business
Analytics

Virpi Hotti and Ulla Gain

School of Computing, University of Eastern Finland, Kuopio, Finland


{Virpi.Hotti,gain}@uef.fi

Abstract. The revolutionary development in cognitive computing is connected with natural language processing. For example, the IBM Watson Personality Insights service is research-based, and IBM’s intuition is that writing always reflects the author’s personality. The IBM Watson Personality Insights service provides a list of the behaviors that the personality is likely (e.g., treat yourself) or unlikely (e.g., put health at risk) to manifest. However, the usefulness of the objective insights has to be figured out in the business context, where organizations have to perform and conform to their duties. Furthermore, organizations have to predict future outcomes with several kinds of business analytics. In this paper, the ideas around organizational ambidexterity (i.e., exploitation and exploration) are used to clarify the meaning of the objective insights. The objective insights increase behavior-centric value propositions and decrease the number of stakeholder-centric business analytics.

Keywords: Exploitation · Exploration · Insights · Principles

1 Introduction

Insight-driven organizations [1] will increase in the future, as will insights as services. The insights are generated for further usage, for example, to cover unknown desires or needs (i.e., the human being does not yet know that his desires and needs are formed by his experiences, which have left their marks on his linguistic expressions). When we have insights about a human being, we can offer him, first of all, experiences within products and services whose suitability corresponds to his needs and desires (or values). When we want to be human- or behavior-centric, we have to learn to ask questions from the human being’s point of view, as follows [2]:
• Advise me (i.e., bring expertise to interactions)
• Alert me (i.e., personalize communication within real-time predictive analytics)
• Ask me (i.e., consult on products, services, and social issues)
• Compare me (i.e., offer peer analytics on virtual channel)
• Educate me (i.e., offer digital online and give tips “in the moment”)
• Excite me (i.e., offer unexpected services at unexpected moments)

© Springer International Publishing Switzerland 2016


H. Li et al. (Eds.): WIS 2016, CCIS 636, pp. 223–237, 2016.
DOI: 10.1007/978-3-319-44672-1_18

• Find me (i.e., use visualization and analytics to discover segments)


• Grow with me (i.e., connect data and insights to lives and households)
• Know me (i.e., offer new products and services based on understanding desires and
needs)
• Let me choose (i.e., offer optional versus prerequisites, roadmaps versus
checkboxes)
• Protect me (i.e., offer multifactor security)
• Trade with me (i.e., give in return better products and services based on sharing
data, location, and new ideas)
In the dataism era (or data-ism [3]), the world is controlled by conceptualization, in which a concept is defined as an “abstract entity for determining category membership” [4]. For example, personality is being conceptualized by cognitive computing. The IBM Watson Personality Insights service [5] applies “linguistic analytics and personality theory to infer traits” [6] from text (e.g., from social media, enterprise data, or other digital communications [7]) – “IBM’s intuition is that writing always reflects the author’s personality, regardless of the subject matter” [8]. The IBM Watson Personality Insights service uses a corpus of words that reflect the high or low values of particular characteristics [9] at four levels of strength [10]: weak (100–1500 words), decent (1500–3500 words), strong (3500–6000 words) and very strong (6000+ words). The IBM Watson Personality Insights service enables objective insights into human beings. The utilizations of the personality insights range from self-study to personalized services such as product recommendations [10]; matching individuals, such as doctor-patient matching, because patients prefer doctors who are similar to themselves; monitoring and predicting mental health, such as predicting postpartum and other forms of depression from social media; and monitoring radical and rogue elements via social media [11]. There are some applications (e.g., Celebrity Match [12] and Investment Advisor [13]) in which the IBM Watson Personality Insights service is a meaningful part.
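The documented word-count bands can be sketched as a small helper; the function name and the "insufficient" label for texts under the 100-word minimum are our own additions, not part of the service's API.

```python
def input_strength(word_count: int) -> str:
    """Map a text's word count to the analysis-strength band documented
    for the Personality Insights service. Illustrative helper only."""
    if word_count < 100:
        return "insufficient"  # below the documented 100-word minimum
    if word_count < 1500:
        return "weak"
    if word_count < 3500:
        return "decent"
    if word_count < 6000:
        return "strong"
    return "very strong"
```

For example, a 2,000-word corpus falls in the "decent" band.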
One way to bring the insights closer to the experiments in business is to use the same concepts as in business. Nowadays, evidence-based decision making is required. ISO 9000:2015 has a principle for evidence-based decision making, and the statement of the principle is as follows [14]: “Decisions based on the analysis and evaluation of data and information are more likely to produce desired results”. In this paper, the analysis of data and information refers to business analytics, the purpose of which is to predict outcomes. The evaluation of data and information refers to business intelligence, the purpose of which is to judge performance. Furthermore, within business analytics and business intelligence, the ideas around organizational ambidexterity (i.e., exploitation and exploration) have been adapted to clarify the meaning of both the business analytics and the objective insights.
The “original meaning of ambidexterity was an individual’s capacity to be equally skillful with both hands” [15]. In 1976, Duncan defined organizational ambidexterity as the ability of an organization to balance short- and long-term objectives. At the beginning of the 1990s, March replaced short- and long-term objectives with exploitation and exploration. Bøe-Lillegraven crystallizes previous ambidexterity studies as follows [16]: “exploration is linked to growth whereas exploitation is linked to profits”. There are a lot of articles written about organizational ambidexterity. However, there was only one article [16] in which ambidexterity is combined with analytics (Google Scholar, intitle:ambidexterity + intitle:analytics), and it handles neither business analytics nor insights based on cognitive computing. In this paper, the adaptation of organizational ambidexterity is a novelty: ambidexterity is combined with business analytics and with objective insights such as personality insights.
The main aim of this paper is to encourage experiments around behavior-centric value propositions based on objective insights. Therefore, we clarify our ambidexterity adaptation with data- and principle-based extractions and the building blocks of business models, and we map some business analytics onto exploitation and exploration (Sect. 2). Furthermore, we clarify whether there are explicit explanations for the personality traits of the IBM Watson Personality Insights service and how we can use them for planning value propositions (Sect. 3).

2 Ambidexterity and Business

Nowadays, data is refined into insights that can be equated with principles. If we understand the patterns in the data, then it is possible to understand the principles [17], and vice versa. There has to be two-way transparency between selected data (i.e., capta, “which is taken in analysis” [18]) and principles (Fig. 1). Data has to be processed from raw data into principles, and the principles have to have identified mechanisms (i.e., metrics whose sources are datasets) that are used to measure whether a principle has been met or not. By setting metrics to monitor performance, organizations can capture timely information that helps drive organizational performance.
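The mechanism described above (a principle measured by metrics computed from datasets) can be sketched minimally as follows; the Principle structure, the example principle, and its 2.0-second threshold are illustrative assumptions, not taken from the cited sources.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Principle:
    """A principle with an identified mechanism: a metric whose source is a
    dataset, and a rule that decides whether the principle has been met."""
    name: str
    metric: Callable[[Sequence[float]], float]
    is_met: Callable[[float], bool]


# Hypothetical principle: "service responses stay under 2.0 s on average".
timely_service = Principle(
    name="timely service",
    metric=lambda xs: sum(xs) / len(xs),  # metric computed from the dataset
    is_met=lambda m: m < 2.0,             # rule: principle met or not
)

capta = [1.2, 1.8, 2.1, 1.5]          # the data "taken in analysis"
value = timely_service.metric(capta)  # mean response time
met = timely_service.is_met(value)    # whether the principle is met
```

The same structure keeps the two-way transparency visible: the metric points back to its dataset, and the rule points back to its principle.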

Fig. 1. Data-based and principle-based extractions

When the strategic goals are clear, this means measurable objectives whose targets are set. TOGAF [19] defines objectives as time-bounded milestones for enterprises, used to demonstrate progress towards goals and tracked against measures (i.e., indicators or factors). In general, the governing body (the “person or group of people who are accountable for the performance and conformance of the organization” [20]) has to understand where the organization has to perform and conform to its duties, and it has to be familiar with the outcomes predictable by several kinds of business analytics. Hence, for example, Bernard Marr has published widely known publications to clarify key questions for analytics and performance [21], key performance indicators [22], key business analytics [23], and even a construction for a strategy board having six panels, whose panel-specific questions are as follows [24]:
• The Purpose Panel - why your business exists, and what you want your business to
be in the future?
• The Customer Panel - how much you know about the customers, and what you may
need to find out in order to deliver on your strategic objective?
• The Finance Panel - how does your strategy generate money, and are you confident
your business model is accurate?
• The Operations Panel - what you need to do internally to deliver your strategy, and
what core competencies will you need to excel if you are going to execute your
chosen strategy?
• The Resource Panel - what resources you need to deliver your strategy and what
you may need to find out?
• The Competition and Risk Panel – what is threatening your success?
Instead of panels or canvases, the ideas around organizational ambidexterity are used in this paper both to clarify insight-based business and to encourage experiments around, for example, new technology. Organizational ambidexterity refers to competing in both mature and new markets and technologies – the mature ones are labelled with efficiency, control and incremental improvement; the new ones with flexibility, autonomy, and experimentation [25]. Furthermore, there are some alignments (e.g., strategic intent, critical tasks, competences, structure, controls and rewards, culture and leadership role [26]) that have to be taken into consideration when exploitation and exploration are compared.
Business models are described within the building blocks of either the Business Model Canvas (BMC) or the Lean Canvas. The BMC building blocks are as follows [27]: Key partners, Key resources, Value proposition, Customer relationship, Customer segment, Distribution channel, Revenue stream. However, Lean Canvas replaces five building blocks in the following way [28]: Key Partners become Problem, Key Activities become Solution, Key Resources become Key Metrics, Value Proposition becomes Unique Value Proposition, and Customer Relationship becomes Unfair Advantage. It is obvious that the Business Model Canvas supports mainly exploitation and the Lean Canvas supports mainly exploration. However, exploitation and exploration are business models having several interaction or integration mechanisms. For example, Bøe-Lillegraven constructs six dimensions for explore and exploit value chains in which activities can be partly the same. Bøe-Lillegraven illustrates the interaction mechanisms between the following six dimensions: resource allocation, cost structure, value proposition, market performance, revenues, and profits. The resource allocation is the first dimension of the value chains, and it seems to be a challenge for leaders because they have to be “able to orchestrate the allocation of resources between the old and new business domains” [25].
Nowadays, alternative calculations are made around available resources, cost structure, revenues, and profits. Therefore, we do not take optimization into consideration when we want to find out where insights based on cognitive computing reduce the number of needed business analytics. When we mapped different business analytics onto exploitation and exploration, we used two dimensions (i.e., value proposition and market performance) of the multi-dimensional conceptual

Table 1. Mapping a set of business analytics
(columns: Exploitation, Exploration, Value proposition, Market performance)
• Customer profitability – Finding money-making customers (x x x)
• Product profitability – Finding money-making products (x x x)
• Value driver – Clarifying value drivers (x x)
• Non-customer – Finding new opportunities (x x)
• Customer engagement – Estimating impacts on the customer experience (x x x)
• Customer segmentation – Increasing revenue by meeting needs (x x)
• Customer acquisition – Finding problems in the buying process (x x)
• Marketing channel – Where and when are prospects and customers reachable? (x x)

framework of explore and exploit value chains by Bøe-Lillegraven. Therefore, the selected business analytics [23] concern mainly customers and the market (Table 1). Furthermore, if we found the corresponding performance indicators, then we mapped those within market performance. When we made our mappings, we realized that most of the business analytics concern different stakeholders (e.g., customers, employees and shareholders). Therefore, we exemplified the questions in which the personality insights are central (Table 2).

Table 2. Mapping personality insights for business analytics
Trait-based questions start with “What are the personality insights …”
• Customer profitability – … of the found money-making customers?
• Product profitability – … of the buyers of the found money-making products?
• Value driver – … of the most important stakeholders?
• Unmet need, Customer segmentation, Customer engagement, Customer acquisition – … of the customers?
• Non-customer – … of the prospects?
• Marketing channel – … of the customers per marketing channel?

3 Personality Insights

The IBM Watson Personality Insights service provides [10] a personality scoreboard in which the traits are grouped into personality insights of three kinds (i.e., personality, needs and values), and “each trait value is a percentile score which compares the individual to a broader population”. The service provides a list of the behaviors that the personality is likely (e.g., treat yourself, click on an ad, follow on social media) or unlikely (put health at risk, re-share on social media, take financial risks) to manifest. The behaviors are based on studies “which reveal different correlations between personality traits and behaviors in certain industries or domains”, such as the following [10]: spending habits are related to the emotional range, risk profiles to extraversion, and healthy decisions to conscientiousness.
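The scoreboard described above can be sketched as a plain data structure; the field names and example scores below are our own shorthand, not the service's actual response schema.

```python
# Sketch of a personality scoreboard: traits grouped into three kinds of
# insight, each trait scored as a percentile against a broader population,
# plus likely/unlikely behaviors. All score values are hypothetical.
scoreboard = {
    "personality": {"Openness": 0.93, "Conscientiousness": 0.41},
    "needs": {"Curiosity": 0.88, "Stability": 0.35},
    "values": {"Hedonism": 0.52},
    "behaviors": {
        "likely": ["treat yourself", "click on an ad"],
        "unlikely": ["put health at risk", "take financial risks"],
    },
}


def percentile(group: str, trait: str) -> float:
    """Percentile score comparing the individual to a broader population."""
    return scoreboard[group][trait]
```

Keeping the three insight kinds as separate groups mirrors how the service reports them and makes trait lookups explicit about which model a trait belongs to.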
“IBM tends to believe that personality evolves within certain bounds, it has conducted no studies to examine the upper and lower bounds of this evolution” [10]. However, IBM gives the following recommendations [10]: work with the latest data and with as much available data as possible, and refresh personality portraits at regular intervals to capture individuals’ evolving personalities. Furthermore, the Personality Insights service itself is evolving. For example, we tested the same text (Fig. 2) twice with the IBM Watson Personality Insights service [6].

There is research literature [29] behind the words, and they have been validated by testing with users. The obtained personality traits are divided into three groups as follows [30]:

Fig. 2. Example text for the IBM Watson Personality Insights service

• Big Five describes “how a person engages with the world”. The model includes the following five primary dimensions (Tables 3, 4, 5, 6 and 7), the six facets of which further characterize an individual along each dimension.
• Values describe motivating factors that influence decision making. The model includes five dimensions of human values as follows: Self-transcendence/Helping others, Conservation/Tradition, Hedonism/Taking pleasure in life, Self-enhancement/Achieving success, Open to change/Excitement.
• Needs describe which aspects of a product will resonate with an individual. The model includes twelve characteristic needs as follows: Excitement, Harmony, Curiosity, Ideal, Closeness, Self-expression, Liberty, Love, Practicality, Stability, Challenge, and Structure.
We got partly different summaries (Fig. 3), and only two facets, Intellect and Authority-challenging, had decreased by one point, from 100 to 99 (Fig. 4). There are explicit explanations for the Big Five sentences [31]. However, we did not find explanations for the sentences of Needs and Values. Furthermore, there are a lot of properties [32] for primary and secondary dimensions without explanations for the summary sentences (e.g., “You are shrewd and inner-directed”) based on those properties.
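Tracking such run-to-run changes can be sketched with a small diff helper; the facet scores below are illustrative placeholders except for the one-point drops in Intellect and Authority-challenging reported above.

```python
def facet_drift(first_run: dict, second_run: dict) -> dict:
    """Point change per facet between two service runs on the same text."""
    return {facet: second_run[facet] - first_run[facet]
            for facet in first_run if facet in second_run}


# Intellect and Authority-challenging dropped from 100 to 99 between our
# two runs; the Imagination score is a made-up placeholder.
first = {"Intellect": 100, "Authority-challenging": 100, "Imagination": 97}
second = {"Intellect": 99, "Authority-challenging": 99, "Imagination": 97}
drift = facet_drift(first, second)
```

A helper of this kind also supports IBM's recommendation to refresh personality portraits at regular intervals, since it makes evolving scores visible.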
Some traits indicate that personality can predict certain outcomes. For example, conscientiousness predicts job performance and extraversion

Table 3. Facets of openness (openness is “the extent to which an individual is open to experiencing a variety of activities”)
• Adventurousness. High, Adventurous: “You are eager to experience new things”. Low, Consistent: “You enjoy familiar routines and prefer not to deviate from them”.
• Artistic interests. High, Appreciative of art: “You enjoy beauty and seek out creative experiences”. Low, Unconcerned with art: “You are less concerned with artistic or creative activities than most people”.
• Emotionality. High, Emotionally aware: “You are aware of your feelings and how to express them”. Low, Dispassionate: “You do not frequently think about or openly express your emotions”.
• Imagination. High, Imaginative: “You have a wild imagination”. Low, Down-to-earth: “You prefer facts over fantasy”.
• Intellect. High, Philosophical: “You are open to and intrigued by new ideas and love to explore them”. Low, Concrete: “You prefer dealing with the world as it is, rarely considering abstract ideas”.
• Authority-challenging (i.e., liberalism). High, Authority-challenging: “You prefer to challenge authority and traditional values to effect change”. Low, Respectful of authority: “You prefer following with tradition to maintain a sense of stability”.

Table 4. Facets of conscientiousness (conscientiousness is a “tendency to act in an organized or thoughtful way”)
• Achievement-striving. High, Driven: “You set high goals for yourself and work hard to achieve them”. Low, Consistent: “You enjoy familiar routines and prefer not to deviate from them”.
• Cautiousness. High, Deliberate: “You carefully think through decisions before making them”. Low, Bold: “You would rather take action immediately than spend time deliberating making a decision”.
• Dutifulness. High, Dutiful: “You take rules and obligations seriously, even when they are inconvenient”. Low, Carefree: “You do what you want, disregarding rules and obligations”.
• Orderliness. High, Organized: “You feel a strong need for structure in your life”. Low, Unstructured: “You do not make a lot of time for organization in your daily life”.
• Self-discipline. High, Persistent: “You can tackle and stick with tough tasks”. Low, Intermittent: “You have a hard time sticking with difficult tasks for a long period of time”.
• Self-efficacy. High, Self-assured: “You feel you have the ability to succeed in the tasks you set out to do”. Low, Self-doubting: “You frequently doubt your ability to achieve your goals”.

Table 5. Facets of extraversion (extraversion is “a tendency to seek stimulation in the company of others”)
• Activity level. High, Energetic: “You enjoy a fast-paced, busy schedule with many activities”. Low, Laid-back: “You appreciate a relaxed pace in life”.
• Assertiveness. High, Assertive: “You tend to speak up and take charge of situations, and you are comfortable leading groups”. Low, Demure: “You prefer to listen than to talk, especially in group situations”.
• Cheerfulness. High, Cheerful: “You are a joyful person and share that joy with the world”. Low, Solemn: “You are generally serious and do not joke much”.
• Excitement-seeking. High, Excitement-seeking: “You are excited by taking risks and feel bored without lots of action going on”. Low, Calm-seeking: “You prefer activities that are quiet, calm, and safe”.
• Outgoing (i.e., friendliness). High, Outgoing: “You make friends easily and feel comfortable around other people”. Low, Reserved: “You are a private person and do not let many people in”.
• Gregariousness. High, Sociable: “You enjoy being in the company of others”. Low, Independent: “You have a strong desire to have time to yourself”.

Table 6. Facets of agreeableness (agreeableness is a “tendency to be compassionate and cooperative toward others”)
• Altruism. High, Altruistic: “You feel fulfilled when helping others and will go out of your way to do so”. Low, Self-focused: “You are more concerned with taking care of yourself than taking time for others”.
• Cooperation. High, Accommodating: “You are easy to please and try to avoid confrontation”. Low, Contrary: “You do not shy away from contradicting others”.
• Modesty. High, Modest: “You are uncomfortable being the center of attention”. Low, Proud: “You hold yourself in high regard and are satisfied with who you are”.
• Uncompromising (i.e., morality). High, Uncompromising: “You think it is wrong to take advantage of others to get ahead”. Low, Compromising: “You are comfortable using every trick in the book to get what you want”.
• Sympathy. High, Empathetic: “You feel what others feel and are compassionate toward them”. Low, Hard-hearted: “You think people should generally rely more on themselves than on others”.
• Trust. High, Trusting of others: “You believe the best in others and trust people easily”. Low, Cautious of others: “You are wary of other people’s intentions and do not trust easily”.

Table 7. Facets of emotional range (emotional range is “the extent to which emotions are sensitive to the environment”)
• Fiery (i.e., anger). High, Fiery: “You have a fiery temper, especially when things do not go your way”. Low, Mild-tempered: “It takes a lot to get you angry”.
• Prone to worry (i.e., anxiety). High, Prone to worry: “You tend to worry about things that might happen”. Low, Self-assured: “You tend to feel calm and self-assured”.
• Melancholy (i.e., depression). High, Melancholy: “You think quite often about the things you are unhappy about”. Low, Content: “You are generally comfortable with yourself as you are”.
• Impulsiveness (i.e., immoderation). High, Hedonistic: “You feel your desires strongly and are easily tempted by them”. Low, Self-controlled: “You have control over your desires, which are not particularly intense”.
• Self-consciousness. High, Self-conscious: “You are sensitive about what others might be thinking of you”. Low, Confident: “You are hard to embarrass and are self-confident most of the time”.
• Susceptible to stress (i.e., vulnerability). High, Susceptible to stress: “You are easily overwhelmed in stressful situations”. Low, Calm under pressure: “You handle unexpected events calmly and effectively”.

Fig. 3. Examples of summaries

Fig. 4. Examples of sunbursts



Table 8. Examples of mappings between primary characteristics and outcomes


Openness Conscientiousness Emotional range Agreeableness Extraversion Outcome
High Try new thing
High Respond to product
respond
High Not abuse credit cards
Low Abuse credit cards
Low Low Avoid taking risks
High Take risks
High High Self-improvement
High High High Greater life expectancy
High Consume high-fat food
High Consume low-fat food
High Try varied diet
High Participate in religious practices

indicates job satisfaction. However, we collected 12 outcomes [30] (Table 8) which can be explicitly related to primary characteristics. Actually, the primary characteristics are called social propensities in the Tone Analyzer service [33].
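The trait-outcome pattern can be sketched as a rule lookup; the two rules mirror the prose above (conscientiousness predicts job performance, extraversion indicates job satisfaction), while the 0.5 percentile threshold for "high" is an illustrative assumption.

```python
# Each rule pairs a Big Five dimension with the outcome it is reported to
# predict; the threshold for a "high" trait value is our own assumption.
RULES = [
    ("Conscientiousness", "job performance"),
    ("Extraversion", "job satisfaction"),
]


def predicted_outcomes(profile: dict, threshold: float = 0.5) -> list:
    """Outcomes whose associated trait percentile exceeds the threshold."""
    return [outcome for trait, outcome in RULES
            if profile.get(trait, 0.0) > threshold]


outcomes = predicted_outcomes({"Conscientiousness": 0.9, "Extraversion": 0.2})
```

Further rows of Table 8 could be added to RULES in the same way once their trait conditions are fixed.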
In the Celebrity Match service, the personality traits (Table 9) are adapted in a way that might decrease the credibility of the IBM Watson Personality Insights service. However, the matching idea becomes clear, for example, when the prime minister of Finland has nearly the same values as the Dalai Lama (Fig. 5).
When we offer experiences within products and services, it is crucial to understand that the consciousness of the human being is the wholeness of his experiences [34], the contents of which can differ qualitatively (in degree of consciousness, in clarity, and in whether they can be indicated linguistically). The human being has both intentional and unintentional experiences [35]. An intentional experience can be either manifest or latent. A manifest experience is immediately understood. However, a latent one can evolve into a manifest one, for example, with the help of teaching, growth, psychotherapy or self-assessment. The human being is also allowed to, and should, have unintentional experiences, such as concerts, simply for the sake of his state of well-being. When we are going to arrange or offer experiences of different kinds, for example, to our customers, we adapt the names of the expected experiences as follows: manifest experiences are insightful ones, latent experiences are challenging ones, and unintentional experiences are sensuous (even attractive) ones. When we map the traits (Table 10) onto the expected experiences of different kinds, we will be behavior-centric when we plan value propositions.
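The mapping idea can be sketched as a lookup from traits to the expected-experience kinds named above. Only the assignments that are unambiguous here are encoded: Adventurousness maps to all three kinds in Table 10, and the Liberalism and Intellect assignments follow the Discussion (authority-challenging personalities suit provocative, i.e., challenging experiences; philosophical ones suit well-argued, i.e., insightful ones). The remaining single-mark rows of Table 10 are left out, and the helper function is our own.

```python
# Partial sketch of Table 10: personality traits mapped to expected-
# experience kinds. Only the unambiguous assignments are encoded.
EXPERIENCE_MAP = {
    "Adventurousness": {"insightful", "challenging", "sensuous"},
    "Liberalism": {"challenging"},   # authority-challenging personalities
    "Intellect": {"insightful"},     # philosophical personalities
}


def audience_for(kind: str) -> list:
    """Traits whose bearers are mapped to the given experience kind."""
    return sorted(trait for trait, kinds in EXPERIENCE_MAP.items()
                  if kind in kinds)
```

With a complete map, such a lookup would let a planner go from a planned experience kind to the trait profile of its best audience.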

Table 9. Example of adapted personality traits


Extremes of Celebrity match Personality Needs Values
Cautious Curious X
Easy Going Organized X
Reserved Outgoing X
Analytical Compassionate X
Confident Sensitive X
Ease Challenge X
Independence Belonging X
Fulfillment Exploration X
Calm Excitement X
Contentedness Acceptance X
Imperfection Perfection X
Restraint Freedom X
Introversion Extroversion X
Complacency Eagerness X
Obliviousness Identification X
Risk Stability X
Flexibility Structured X
Modernity Tradition X
Constancy Stimulation X
Stoicism Hedonism X
Non-conformity Conventionality X
Egoism Selflessness X

Fig. 5. Celebrity match between Juha Sipilä and Dalai Lama



Table 10. Some personality traits mapped within experiences
(columns: Insightful, Challenging, Sensuous)
• Adventurousness – Adventurous: “You are eager to experience new things” (x x x)
• Impulsiveness – Hedonistic: “You feel your desires strongly and are easily tempted by them” (x)
• Artistic interests – Appreciative of art: “You enjoy beauty and seek out creative experiences” (x)
• Imagination – Down-to-earth: “You prefer facts over fantasy” (x)
• Liberalism – Authority-challenging: “You prefer to challenge authority and traditional values to effect change” (x)
• Intellect – Philosophical: “You are open to and intrigued by new ideas and love to explore them” (x)

4 Discussion

We found a zoo of performance indicators and even of business analytics. Therefore, we illustrated the mapping of business analytics onto exploitation and exploration. Furthermore, we took two building blocks (i.e., value proposition and market performance) to complete our construction, in which we mapped the set of business analytics. When we made our mappings between the business analytics and the insights, we realized that most of the business analytics concern different stakeholders (e.g., customers, employees and shareholders). Usually, when organizations are interested in the engagement of their stakeholders, the personality insights are a reasonable form for them.

When an organization finds a balance between exploitation and exploration, it can allocate resources optimally. Hence, data-driven optimization requires understanding organizational ambidexterity, i.e., that the organization has to improve and invent at the same time. The pressures for performance and conformance, together with prediction, grow when organizations either compete or try to be more effective. Therefore, we believe that understanding the possibilities of the objective insights is crucial. However, the insights have to be actionable. Therefore, we constructed both the mappings of business analytics onto the insights and the mappings of personality traits onto the expected experiences (i.e., intentional and unintentional ones, the names of which can be modified in a value proposition).

If the insights services are going to be used for value propositions, then the insights have to be transparent, i.e., the explanations have to be explicit. Furthermore, the organizations have to form their own points of view for the adaptations of the insights. In the future, both experiments and research are needed around the presented constructions.

In particular, the traits of the IBM Watson Personality Insights service are difficult to understand without explanations, which is the reason why we collected the explanations of the Big Five into this paper. A value proposition based on the personality insights might gain new points of view as follows: if you are going to offer even provocative experiences, then authority-challenging personalities might be the best audience; if your audience is a philosophical one, then well-argued things offered in a straightforward style might be reasonable. However, the organizations have to make their own experiments if they want to achieve either competitive advantages or effectiveness by adapting the objective insights.

Paper III
Authors: Gain U, Hotti V, Lauronen H.

Year: 2017.

Article title: Automation capabilities challenge work activities cognitively

Journal: Futura, 36(2):25-35, ISSN-number: 0785-5494

Permissions from co-authors via email: Hotti Virpi received 9.11.2021 at 7:57; Lauronen Henna received 9.11.2021 at
22:17

Reproduced with permission in this thesis by Hazel Salminen, Tulevaisuuden tutkimuksen seura ry, received by email
12.11.2021 at 15:36
Referee article

Ulla Gain, Virpi Hotti and Henna Lauronen

Automation capabilities challenge work activities cognitively


Cognitive computing is today’s AI. Cognitive services are the building blocks of cognitive computing, and they increase automation capabilities. Cognitive services mimic brain mechanisms, for example, by extracting tones (e.g., emotions and social tendencies) or returning ranked answers to queries. In this paper, we construct the capabilities of the cognitive services (i.e., IBM’s Natural Language Understanding, Tone Analyzer, Personality Insights, Retrieve and Rank, TradeOff Analytics and Discovery) to illustrate what we can do with the outcomes of the cognitive services. The outcomes of the cognitive services are mapped into 18 automation capabilities based on work activities. Furthermore, we present some supplements from the mapping between the cognitive services and the 52 cognitive processes of the human brain. Finally, we construct the capabilities of the cognitive services with a utilization mind-set that illustrates what we can do with the outcomes of the cognitive services.

Introduction

Automation has several definitions, such as the “replacement of manual operations by computerized methods” (ISO/IEC 2016) or the “conversion of processes or equipment to automatic operation, or the results of the conversion” (ISO/IEC 2015). However, deploying automation means that something is going to be done without direct manual intervention, or even without human intervention. Automation potential does not mean that we do not need employees. It means that the descriptions of the job or work are changing, or that the work activities are performed in a new or even novel way. It has even been predicted that the impact of automation might cover about 80% of the global labor market, where a few occupations are fully automatable and 60% of all occupations have at least 30% technically automatable activities (McKinsey 2017).

In the future dialogues of work, automation (e.g., cognitive services) is seen as a threat to jobs. One reason for this is the fact that the dialogues concern jobs, works or occupations, not the activities or tasks related to them. Furthermore, the required skills or capabilities of the work activities are described at too general a level. For example, we might have skills such as “literacy, numeracy and problem solving” (OECD 2016, 31), or our skills might refer “to the ability to apply the knowledge and use the know-how needed to carry out the tasks comprised in a particular job” (Skills Panorama 2017). However, our skills do not automatically mean that we are capable of performing all the work activities of the tasks.

Nowadays, cognitive computing mimics brain mechanisms. The capabilities of cognitive computing (today’s AI) can be crystallized as follows (Vorhies 2016): image and video processing, text and speech processing, and knowledge retrieval. The providers of cognitive services divide them into groups. IBM (2017) divides the cognitive services of cognitive computing into the groups featured, language, speech, vision and data insights; Microsoft (2017) divides the cognitive services into the groups language, speech, vision, knowledge and search.

However, the crystallized capabilities or the API groups (e.g., IBM’s and Microsoft’s cognitive services) do not illustrate how the cognitive services increase the cognitive abilities of the work activities. Hence, we select cognitive services by which we can illustrate comprehensively how they increase the capabilities of the work activities. We approach the focusing of the cognitive services through their outcomes (Figure 1). In order to understand the message of the article, please study the following IBM Watson APIs by opening the Internet sources mentioned below. IBM Natural Language Understanding (IBM 2017a) analyzes semantic features (e.g., denotations). IBM Tone Analyzer (IBM 2017b) extracts tones, which are emotions (i.e., anger, disgust, fear, joy, sadness), language styles (i.e., analytical, confident, tentative) and social tendencies (i.e., openness, conscientiousness, extraversion, agreeableness, emotional range). IBM Personality Insights (IBM 2017c) helps to uncover personality traits and derives some consumption preferences. IBM Retrieve and Rank (IBM 2017d) returns ranked answers to queries. IBM TradeOff Analytics (IBM 2017e) identifies the best options based on multiple criteria. IBM Discovery (IBM 2017f) explores information from the news, where private, third-party and public data are used.

Our main aim is to construct the capabilities of the cognitive services with the utilization mindset to illustrate what we can do with the outcomes of the cognitive services (Section 4). However, we cannot construct the actions without the reasons. Therefore, we must find out whether the features of the cognitive services are straightforward to map into McKinsey’s 18 automation capabilities, because McKinsey’s capabilities are based on work activities (Section 2). Further, we present some supplements from the mapping between the cognitive services and cognitive processes (Section 3).

Figure 1. Outcomes of the cognitive services.

Automation capabilities versus cognitive services

McKinsey’s automation capabilities are used when the capabilities of the cognitive services are researched. A machine-learning algorithm has been used to score more than 2,000 work activities (e.g., greet customers, answer questions about products and services, and demonstrate product features) of more than 800 occupations in relation to 18 automation capabilities by matching keywords from the capability to the activity title (McKinsey 2017, 3, 21, 35, 123).

All work activity categories (e.g., managing people, applying expertise, interfacing with
stakeholders, unpredictable physical activities, collecting data, processing data, predictable physical activities) have automation potential. However, the activity categories with the highest automation potential are as follows (McKinsey 2017): predictable physical activities 81%, processing data 69% and collecting data 64%. The 18 automation capabilities of the work activities within the criteria for automation (i.e., accept input, information processing, deliver output, and physical movement) are as follows (McKinsey 2017, 34−35, 120−123), with our letter abbreviations:

1. NLU. Accept input − Natural language understanding. Ability to “comprehend language, including nuanced human interaction”.
2. SEP. Accept input − Sensory perception. Ability to “autonomously infer and integrate complex external perception using sensors” (i.e., “visual perception, tactile sensing, and auditory sensing, and involves complex external perception through integrating and analyzing data from various sensors in the physical world”).
3. SES. Accept input − Social and emotional sensing. Ability to “identify social and emotional state”.
4. RKP. Information processing − Recognizing known patterns / categories (supervised learning). Ability to “recognize simple / complex known patterns and categories other than sensory perception”.
5. GNP. Information processing − Generation of novel patterns / categories. Ability to “create and recognize new patterns/categories (e.g., hypothesized categories)”.
6. POS. Information processing − Logical reasoning / problem solving. Ability to “solve problems in an organized way using contextual information and increasingly complex input variables other than optimization and planning”.
7. OPP. Information processing − Optimization and planning. Ability to “optimize and plan for objective outcomes across various constraints”.
8. CRE. Information processing − Creativity. Ability to “create the diverse and novel ideas, or the novel combinations of ideas”.
9. INR. Information processing − Information retrieval. Ability to “search and retrieve information from a large scale of sources (breadth, depth, and degree of integration)”.
10. COA. Information processing − Coordination with multiple agents. “Interact with others, including humans, to coordinate group activity”.
11. SER. Information processing − Social and emotional reasoning. Ability to “accurately draw conclusions about social and emotional state, and determine appropriate response/action”.
12. OUT. Deliver output − Output articulation / display. Ability to “deliver outputs / visualizations across a variety of mediums other than natural language” (i.e., “automated production of pictures, diagrams, graphs, or mixed media presentations”).
13. NLG. Deliver output − Natural language generation. Ability to “deliver messages in natural language, including nuanced human interaction and some quasi language (e.g., gestures)”.
14. SEO. Deliver output − Emotional and social output. Ability to “produce emotionally appropriate output (e.g., speech, body language)”.
15. FMS. Physical movement − Fine motor skills / dexterity. Ability to “manipulate objects with dexterity and sensitivity”.
16. GMS. Physical movement − Gross motor skills. Ability to “move objects with multidimensional motor skills”.
17. NAV. Physical movement − Navigation. Ability to “autonomously navigate in various environments”.
18. MOB. Physical movement − Mobility. Ability to “move within and across various environments and terrain”.

There are two natural language processing capabilities (NLG and NLU), three social and emotional capabilities (SES, SER and SEO), four physical capabilities (FMS, GMS, NAV and MOB) and eight cognitive capabilities (RKP, GNP, POS, OPP, CRE, INR, COA, and OUT). There is only one sensory perception capability (SEP). It is observable that SEP and RKP seem to be alternatives to each other, as well
as POS and OPP. There is an overlap between the capabilities, especially the cognitive ones (McKinsey 2017, 36).

We cannot map cognitive services into automation capabilities without detailed information about the services. Therefore, we have to approach the cognitive services through their refined outcomes (Figure 2). We have to concentrate the refining on whether the outcomes, or parts of them, are predefined. Furthermore, we want to know whether the service searches and retrieves information from the data sources while performing the service. Finally, we have to find out whether unsupervised learning algorithms are used.

The underlined phrases show the basis for the mappings between the researched cognitive services and automation capabilities (Table 1). All researched services fulfil the natural language processing capabilities (i.e., NLG and NLU), as well as the cognitive capability OUT. None of the researched services fulfil the physical capabilities (i.e., FMS, GMS, NAV and MOB), nor the sensory perception capability (SEP) or the coordination with multiple agents (COA) capability. Three services (IBM’s Natural Language Understanding, Tone Analyzer and Discovery) extract attitudes (i.e., sentiments) and/or emotions. Therefore, they fulfil two social and emotional capabilities (SES and SER) without fulfilling the third social and emotional capability (i.e., SEO). The cognitive services are mapped into the cognitive capabilities (i.e., RKP, INR, POS and CRE) for the following reasons:
- IF the service extracts something predefined (i.e., attitudes, emotions, tones or traits) THEN it is mapped into RKP.
- IF the service answers the questions or determines the proposals (i.e., options) or abstracts the denotations (i.e., concepts) using the predefined data sources THEN it is mapped into INR.
- IF the service determines something (e.g., options) in an organized way using contextual information and other input variables THEN it is mapped into POS.

Figure 2. Refined outcomes of the cognitive services for the automation capability mapping.
Table 1. Examples of the cognitive services and their mappings within the automation capabilities

Cognitive service | Accept input | Information processing | Deliver output
IBM Natural Language Understanding | NLU, SES | RKP, SER, INR | NLG, OUT
IBM Tone Analyzer | NLU, SES | RKP, SER, CRE | NLG, OUT
IBM Personality Insights | NLU | RKP, CRE | NLG, OUT
IBM Retrieve and Rank | NLU | INR, POS | NLG, OUT
IBM TradeOff Analytics | NLU | INR, OPP | NLG, OUT
IBM Discovery | NLU, SES | RKP, SER, INR, POS | NLG, OUT
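Read together with the IF–THEN rules given in the text, the information-processing column of Table 1 can be reproduced by a small rule base. The boolean feature flags below are our own reading of the article’s service descriptions (an illustrative sketch, not the authors’ formal procedure); SER is added whenever a service extracts attitudes or emotions.

```python
# Rule-based sketch of the capability mapping described in the text.
# The feature flags are our reading of the service descriptions
# (illustrative only, not the authors' formal procedure).
FEATURES = {
    "Natural Language Understanding":
        {"extracts_predefined", "extracts_sentiment", "predefined_sources"},
    "Tone Analyzer":
        {"extracts_predefined", "extracts_sentiment", "unsupervised"},
    "Personality Insights": {"extracts_predefined", "unsupervised"},
    "Retrieve and Rank": {"predefined_sources", "organized_determination"},
    "TradeOff Analytics": {"predefined_sources", "optimizes_and_plans"},
    "Discovery":
        {"extracts_predefined", "extracts_sentiment",
         "predefined_sources", "organized_determination"},
}

RULES = [  # (feature that triggers the rule, capability it maps to)
    ("extracts_predefined", "RKP"),
    ("extracts_sentiment", "SER"),
    ("predefined_sources", "INR"),
    ("organized_determination", "POS"),
    ("unsupervised", "CRE"),
    ("optimizes_and_plans", "OPP"),
]

def information_processing(service):
    """Return the information-processing capabilities fired by the rules."""
    feats = FEATURES[service]
    return {cap for feat, cap in RULES if feat in feats}
```

Applying `information_processing` to each of the six services yields exactly the information-processing column of Table 1.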

- IF the service uses an unsupervised learning algorithm THEN the service is mapped into CRE.
- IF the service optimizes and plans for objective outcomes across various constraints THEN the service is mapped into OPP.

Figure 3. Refined outcomes of the cognitive services for the cognitive process mapping.

Cognitive processes supplement automation capabilities

Wang et al. (2006) have defined the layered reference model of the brain (LRMB), and Wang (2015) has extended the model. The LRMB contains 52 cognitive processes (see the appendix) mapped into seven mental functions (i.e., sensation, action, memory, perception, cognition, inference, and intelligence). However, there are some cognitive processes whose meanings do not allow them to be replaced with the automation capabilities. Furthermore, the capabilities of the information processing might be supplemented with more capabilities based on the cognitive processes from the cognition,
intelligence and perception functions.

The cognitive services (IBM’s Natural Language Understanding, Tone Analyzer and Personality Insights) have already been mapped onto the cognitive processes, the main aim of which was to learn to review the effects of cognitive services on cognitive ability (Hotti & Lauronen 2017). We cannot map cognitive services into cognitive processes without refining whether the outcomes, or parts of them, are classified, ranked or scored (Figure 3). Furthermore, we will pick out the written proposals (i.e., cited or collected statements such as summaries).

When we combined the mappings (Table 2), we realized that the recognizing known patterns / categories capability (RKP) and the social and emotional reasoning capability (SER) could be replaced with the cognitive processes. Hence, we supplemented the automation capabilities of the information processing with the processes of the cognition and intelligence functions for the following reasons:
- Abstraction – IF the service extracts the predefined denotations (i.e., attitudes, emotions, entities, tones or traits) THEN it is mapped into the abstraction process, the definition of which contains the term ‘denotation’.
- Concept establishment – IF the service constructs concepts THEN it is mapped into the concept establishment process.
- Categorization – IF the service classifies something THEN it is mapped into the categorization process.
- Quantification – IF the service makes a scoring or an evaluation of the reliability THEN it is mapped into the quantification process.
- Decision making – IF the service binds an action or actions THEN it is mapped into the decision making process.
- Planning – IF the service generates some textual proposals THEN it is mapped into the planning process.

Table 2. Cognitive services versus the combined mapping within the automation capabilities and cognitive processes

Cognitive service | Accept input | Information processing | Deliver output
IBM Natural Language Understanding | NLU, SES | INR, Abstraction, Categorization, Concept establishment, Quantification | NLG, OUT
IBM Tone Analyzer | NLU, SES | CRE, Abstraction, Quantification | NLG, OUT
IBM Personality Insights | NLU | CRE, Abstraction, Quantification, Planning | NLG, OUT
IBM Retrieve and Rank | NLU | INR, POS, Quantification, Planning | NLG, OUT
IBM TradeOff Analytics | NLU | INR, OPP, Quantification | NLG, OUT
IBM Discovery | NLU, SES | INR, POS, Abstraction, Quantification | NLG, OUT

Cognitive computed actions and utilization mind-set

In general, the cognitive services distill (and quantify) actionable information or detailed (and quantified) insights from audio, images, text, and video. Furthermore, they enrich (discover, infer, and integrate) the content with the detailed (and quantified) insights or give actionable insights into the content. In our construction, the services are used to facilitate the outcomes, to revise something based on the outcomes, or to manifest something based on the outcomes. Further, we developed the capabilities of the cognitive services with the actions bind, facilitate, manifest and revise (Hotti & Lauronen 2017) as follows:
- Bind. The Microsoft Language Understanding Intelligent Service describes the action binding results. If the cognitive services solve the problems (i.e., the POS capability) or make decisions (i.e., the decision making cognitive process), then we might use the bind verb.
- Facilitate. When the outcomes are extracted (e.g., denotations or statements), then we
can use them as inputs to the work activities that follow.
- Revise. The IBM Tone Analyzer service uses the revise verb, and we adapted the verb to illustrate that the cognitive service can be used iteratively to prepare, for example, the text or the criteria for evaluating the choices.
- Manifest. Sometimes it is even impossible to perceive or to summarize the details, for example, the personality traits. We used the verb if it is obvious that the inferred insights are going to predict the behavioral outcomes.

Our researched services do not bind actions. However, IBM Natural Language Understanding helps to facilitate the denotations or sometimes even to revise something, for example, by underlining sentences by the semantic roles (i.e., subject, action and object). IBM Tone Analyzer extracts the tones of the text, and those can be used to revise the language styles of the text. IBM Personality Insights manifests the traits by the consumption preferences and behavioral characteristics. The IBM Retrieve and Rank service helps to facilitate the ranked answers, where the statements are separated with dots (i.e., the answers to the query). IBM TradeOff Analytics helps to revise the options based on different criteria. IBM Discovery helps to facilitate the news.

Figure 4. Example of the utilization mindset.

In the utilization mindset (Figure 4), cognitive computing offers possibilities for the sensation function by Wang (2015) and Wang et al. (2006). The language-based services process textual inputs, the speech services are employed for the voice (e.g., transforming speech to text), and the vision services accept pictorial inputs (e.g., detecting entities from images). The cognitive services require detecting motion, for example, to be able to keep up interactions between moving users, such as SportsTracker
(2017) or MotionX (2017) data. Further, the services (e.g., IoT services) are planned or built to sense smell, weight, pressure, heat, texture or flavor from device outcomes in order to improve or change the outcome of these matters.

Stephen Few (2015) concludes that data is a collection of facts – when a fact is true and useful (i.e., it must inform, matter and deserve a response), then it is a signal; otherwise it is noise. We underline that we must carry out something with the actionable insights. Therefore, the utilization of the insights terminates only in the action. A continuum in a process after the input is aimed at changing a process or situation (i.e., it is interventional), such as improving or changing the outcome of these matters (e.g., to prevent undesired developments). The bind action is automated to handle the signals without human interventions. The infer insights action uses, for example, the business intelligence and analytics platforms (Gartner 2017a) and the data science platforms (Gartner 2017b). Furthermore, the data science platforms offer a mixture of building blocks (e.g., cognitive services) for creating all kinds of data science solutions for data handling.

The inferred insights are either revisable or questionable. Thus, the actionable insights (i.e., signals) can be processed as the bound action. If the insights show revisable items (e.g., sentences with a certain style), then we will revise the content and use the cognitive service again. However, we will need some extra information or knowledge. Therefore, we might need to use the knowledge retrieval services (e.g., IBM’s Retrieve and Rank) or applications (e.g., IBM’s Watson within the knowledge base). If we do not use the cognitive services, then we have to make rule-based content categorizations, for example, to mine the attitudes or sentiments from the text. We may not directly realize or even understand how much the techniques have evolved in the cognitive services.

Conclusion

We realized that neither McKinsey’s automation capabilities nor the cognitive processes (Wang et al. 2006; Wang 2015) help to assess the cognitive services straightforwardly. Assessing the services requires redefining their outcomes, as well as finding out the detailed information (e.g., the applied unsupervised algorithms). The cognitive services (i.e., IBM’s Natural Language Understanding, Tone Analyzer, Personality Insights, Retrieve and Rank, TradeOff Analytics and Discovery) can be mapped into McKinsey’s automation capabilities. When we supplemented McKinsey’s automation capabilities with the cognitive processes, we realized that not even the combined mappings help in utilizing the cognitive services as work activities, i.e., in establishing what we can do with the outcomes in general. Therefore, we used the actions (i.e., bind, facilitate, revise and manifest) that illustrate what we can do with the outcomes of the cognitive services (i.e., for utilizing the services). Furthermore, we constructed the utilization mindset to conclude how automation (e.g., the cognitive services) releases resources for cognitively demanding activities, or for activities that are not possible or even reasonable to automate, such as activities that require a human touch. In the future, more discussions around the cognitive capabilities of the work activities are needed.

IBM Watson (IBM 2017g) related things (e.g., cognitive services) are usually associated with the IBM Watson knowledge base, which is considered to be expressed by artificial intelligence and which experts, such as doctors, will use in their work when they look for evidence-based answers to their questions (i.e., the experts glean insights as the reasons for their own decisions). The IBM Watson knowledge base has been formed not only with the help of artificial intelligence; experts are also used to make interventions in forming it and in securing the validity of its contents. However, the cognitive services are used to generate cognitive applications for separate purposes instead of utilizing the IBM Watson knowledge base. Therefore, instead of the accuracy of the outcomes, we concentrated on assessing the mapping rules of the cognitive services, the meaning of which is to illustrate what we can do with the outcomes of the cognitive services. The accuracy of the cognitive service outcomes is complex because there might not be mathematical relationships
between inputs and outcomes. For example, instead of mathematical relationships, only the percentiles are reported for the Big Five dimensions and facets (IBM 2017h). The cognitive services are applied to cognitive applications. For example, 113 Industries (2017) applies the IBM Personality Insights service for behavioral segmentation to target consumers based on their personality and buying behavior, and Equals3 (2017) applies several cognitive services (e.g., Personality Insights and Tone Analyzer) for in-depth cognitive market research, analysis and segmentation. Furthermore, scientists have used the cognitive services as a part of their research. For example, personality traits have been inferred from the texts of patients and social media users within IBM Personality Insights, and then the results have been compared statistically (Krishnamurhy et al. 2016). The personality traits can be inferred from texts of different kinds, and they can be used to characterize, for example, groups.

The mapping process was iterative: the authors co-operated and discussed the outcomes of the cognitive services, as well as the mapping rules. The interpretations of the descriptions of the automation capabilities and the definitions of the cognitive processes varied because the reasons for the descriptions and definitions are missing. However, the cited statements of the descriptions of the automation capabilities and the definitions of the cognitive processes provided the opportunity for forming the mapping rules within the experiments of the cognitive service demonstrations. The mapping rules are based on co-operative work among the authors of this paper (i.e., first, everyone mapped the cognitive services into the automation capabilities and cognitive processes separately; second, the mapping rules were formalized; third, both the mapping rules and the mappings were focused in co-operation). Therefore, the mapping rules are claimed to be reliable because all three authors of this paper agree with them. The aspects of the validity are the construct, internal and external ones (Runeson et al. 2012, 71), the meaning of which is checking the accuracy of the results. In this paper, we assess the validity of the mapping rules of the cognitive services, the main aim of which is establishing what we can do with the outcomes of the cognitive services, as follows:
- Construct validity. The used concepts and their explanations are based on the cited statements, as well as the mapping rules based on the defined automation capabilities, cognitive processes and cognitive services.
- Internal validity. The factors (i.e., cognitive services) affect the studied factors (i.e., automation capabilities and cognitive processes) when the mapping rules are defined.
- External validity. Generalizing the mapping rules requires more mappings, where more cognitive services are mapped into the outcome-based actions.

When the matters of skills or capabilities are examined at the activity or task level instead of the job, work or occupation level, then it is possible to plan education of different kinds (e.g., work-based learning) to correspond to the needed capabilities. Furthermore, the requirements of the capabilities (e.g., the automation capabilities) can be taken into consideration precisely. We already know that there is both over-education (people working in a job below their education level) and under-education (people working in a job above their education level). Therefore, capability-based investments in education are needed. It is crucial to be familiar with the capabilities of cognitive computing. For example, technological developments (i.e., automation) “are seen to have strong impact on employment and demand for higher-level occupations”, “automated processes, robots and artificial intelligence can replace routine and data processing jobs and tasks, impacting both blue- and white-collar jobs” and “the increased automation and robotization, are likely to affect occupational and qualifications structures” (Cedefop 2016, 17, 59).
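The capability vocabulary used throughout the article can be collected into one small, self-checking data structure. This is a sketch following the list and counts given in the text (18 capabilities across four criteria; two natural language, three social and emotional, four physical, eight cognitive, and one sensory perception capability).

```python
# Sketch: McKinsey's 18 automation capabilities by criterion for
# automation, plus the thematic groupings derived in the text.
CAPABILITIES = {
    "accept input": ["NLU", "SEP", "SES"],
    "information processing": ["RKP", "GNP", "POS", "OPP",
                               "CRE", "INR", "COA", "SER"],
    "deliver output": ["OUT", "NLG", "SEO"],
    "physical movement": ["FMS", "GMS", "NAV", "MOB"],
}

NATURAL_LANGUAGE = {"NLG", "NLU"}
SOCIAL_EMOTIONAL = {"SES", "SER", "SEO"}
PHYSICAL = {"FMS", "GMS", "NAV", "MOB"}
COGNITIVE = {"RKP", "GNP", "POS", "OPP", "CRE", "INR", "COA", "OUT"}
SENSORY = {"SEP"}

ALL_CODES = {code for group in CAPABILITIES.values() for code in group}
assert len(ALL_CODES) == 18  # 18 automation capabilities in total
# The five thematic groupings partition the same 18 capabilities.
assert (NATURAL_LANGUAGE | SOCIAL_EMOTIONAL | PHYSICAL
        | COGNITIVE | SENSORY) == ALL_CODES
```

The assertions simply re-derive the counts stated in the article, which makes the structure a convenient starting point for replicating or extending the mappings.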
References

113 (2017) 113 Industries, http://www.113industries.com/technology/ibm-watson/.
Cedefop (2016) Future skill needs in Europe: critical labour force trends. Publications Office of the European Union, Cedefop research paper; No 59, 2016, http://www.cedefop.europa.eu/files/5559_en.pdf.
Equals3 (2017) Superpower your marketing process with Lucy. Available from: http://www.equals3.ai/.
Few S. (2015) Signal. ISBN: 978-1-938377-05-1.
Gartner (2017a) ‘Magic Quadrant for Business Intelligence and Analytics Platforms’.
Gartner (2017b) ‘Magic Quadrant for Data Science Platforms’.
Hotti V., Lauronen H. (2017) Kognitiiviset prosessit ja kognitiivisten palvelujen kohdentaminen [Cognitive processes and the targeting of cognitive services]. Accepted to publish and present at the 20th National Research Days for Social and Health Informatics, 22 May 2017, Helsinki, Finland.
IBM (2017) Watson services. Available from: http://www.ibm.com/watson/developercloud/services-catalog.html.
IBM (2017a) Natural Language Understanding. Available from: https://www.ibm.com/watson/developercloud/natural-language-understanding.html.
IBM (2017b) Tone Analyzer. Available from: https://www.ibm.com/watson/developercloud/tone-analyzer.html.
IBM (2017c) Personality Insights. Available from: https://www.ibm.com/watson/developercloud/personality-insights.html.
IBM (2017d) Retrieve and Rank. Available from: https://www.ibm.com/watson/developercloud/retrieve-rank.html.
IBM (2017e) Tradeoff Analytics. Available from: https://www.ibm.com/watson/developercloud/tradeoff-analytics.html.
IBM (2017f) Discovery. Available from: https://www.ibm.com/watson/developercloud/discovery.html.
IBM (2017g) How it works? Available from: https://www.youtube.com/watch?v=_Xcmh1LQB9I.
IBM (2017h) Personality Insights. Available from: https://www.ibm.com/watson/developercloud/doc/personality-insights/output.shtml#numeric.
ISO/IEC 2382:2015(en) Information technology — Vocabulary. Available from: https://www.iso.org/obp/ui/#iso:std:iso-iec:2382:ed-1:v1:en:term:2121284.
ISO/IEC TR 13066-2:2016(en) Information technology — Interoperability with assistive technology (AT) — Part 2: Windows accessibility application programming interface (API). Available from: https://www.iso.org/obp/ui/#iso:std:iso-iec:tr:13066:-2:ed-2:v1:en:term:2.7.
Krishnamurthy M., Mahmood K., Marcinek P. (2016) A Hybrid Statistical and Semantic Model for Identification of Mental Health and Behavioral Disorders using Social Network Analysis. IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 1019–1026.
McKinsey Global Institute (2017) A future that works: automation, employment, and productivity. January 2017. Available from: http://www.mckinsey.com/global-themes/digital-disruption/harnessing-automation-for-a-future-that-works.
Microsoft (2017) Discover APIs. Available from: https://azure.microsoft.com/en-us/services/cognitive-services/.
Microsoft (2017) Language Understanding Intelligent Services (LUIS). Available from: https://www.luis.ai/home/index.
MotionX (2017) MotionX-GPS. Available from: http://gps.motionx.com/.
OECD (2016) Arntz M, Gregory T, Zierahn U. The Risk of Automation for Jobs in OECD Countries: A Comparative Analysis. OECD Social, Employment and Migration Working Papers, No. 189, OECD Publishing, 2016, http://www.oecd-ilibrary.org/docserver/download/5jlz9h56dvq7-en.pdf?expires=1485688622&id=id&accname=guest&checksum=B4FB119CC53C1684069093547189F241.
Runeson P, Host M, Rainer A, Regnell B. (2012) Case Study Research in Software Engineering: Guidelines and Examples. John Wiley & Sons, Inc., Hoboken, New Jersey.
Skills Panorama (2017) About skills themes. European Commission 2017. Available from: http://skillspanorama.cedefop.europa.eu/en/skill#.
Sports Tracker (2017) Sports Tracker. Available from: http://www.sports-tracker.com/.
Vorhies W. (2016) What Can Modern Watson Do? Data Science Central, November 15, 2016, blog. Available from: http://www.datasciencecentral.com/profiles/blogs/what-can-modern-watson-do.
Wang Y., Wang Y., Patel S., Patel D. (2006) A layered reference model of the brain (LRMB). IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 36(2), 124–133.
Wang Y. (2015) Formal Cognitive Models of Data, Information, Knowledge, and Intelligence. WSEAS Transactions on Computers, 14, 770–781.


Appendix: Cognitive processes

The sensation function “for detecting and acquiring cognitive information from the external world via physical and/or chemical means” (Wang et al., 2006) contains eight cognitive processes, two of which (i.e., time and spaciality) are not defined (Wang, 2015) and six of which (vision, hearing, motion, smell, touch, and taste) are defined (Wang et al., 2006) as follows:
- Vision – “Detects and receives visual information from the entities of the external world in the forms of images, shapes, sizes, colors, and other attributes or characteristics”
- Hearing – “Audition . . . detects and receives aural information from sources of the external world in the forms of intensity, frequency, location, and other attributes and characteristics”
- Motion – “Sense of motion . . . detects and interprets status changes related to space and time of external objects or the observer himself in real time” (Wang et al., 2006)
- Smell – “Detects and receives scent by the olfactory nerves from sources of the external world”
- Touch – “Tactility… detects and receives touching information by the contact between an external object and a part of the body surface in the forms of heat, pressure, weight, pain and texture”
- Taste – “Detects and receives flavor information … via the taste buds from sources of the external world”
The action function, the definition of which is “life functions of the brain for output-oriented actions and motions that implement human behaviors, such as a sequence of movement and a prepared verbal sentence” (Wang et al., 2006), contains four cognitive processes (Reflex, Recurrent, Temporary and Complex) without definitions (Wang, 2015). The memory function contains five cognitive processes as follows (Wang, 2015): long-term memory (LTM), short-term memory (STM), sensory buffer memory (SBM), action buffer memory (ABM), and conscious status memory (CSM).

The perception function “for maintaining conscious life functions and for browsing internal abstract memories in the cognitive models of the brain” (Wang et al., 2006) contains eight cognitive processes, three of which (attitude, posture and equilibrium) are not defined (Wang, 2015) and five of which (attention, consciousness, motivation, emotion and imagination) are defined as follows (Wang et al., 2006):
- Attention – “Focuses the mind, or the perceptive thinking engine, on one of the objects or threads of thought by the selective concentration of consciousness”
- Consciousness – “Self-consciousness . . . maintains a stable mental state of human beings for sensation, perception, occurrent thought, and actions to function properly”
- Motivation – “Explains the initiation, persistence, and intensity of CPs [cognitive processes]”
- Emotion – “Emotions are a set of states or results of perception that interpret the feelings of human beings on external stimuli or events in the categories of pleasant or unpleasant, such as joy/worry, happiness/sadness, safety/fear, and pleasure/angry”
- Imagination – “Imagery . . . abstractly sees acquired visual images stored in the brain without any sensory input, or establish a relation between a mental image and the corresponding external entities or events”
The cognition function contains 10 cognitive processes, four of which (i.e., the identification of the objects, comparison, qualification and selection) are not defined (Wang, 2015) and six of which (abstraction, concept establishment, categorization, memorization, quantification and search) have the following definitions:
- Abstraction – “Elicit a target subset of objects in a given discourse that shares a common property as an identity of the subset from the whole in order to facilitate denotation and reasoning” (Wang, 2015)
- Concept establishment – “Constructs a “to be” relation between an object or its attributes and existing objects / attributes” (Wang et al., 2006)
- Categorization – “Identifies common and equivalent attributes or properties shared among a group of entities or objects and then uses the common attributes or properties to identify this group of entities” (Wang et al., 2006)
- Memorization – “Encodes, stores, and retrieves information in LTM, partially controlled by the subconscious processes of sensation, memory, and perception” (Wang et al., 2006)
- Quantification – “Measures and specifies the quantity of an object or attribute by using a quantifier such as all, some, most, and none, or by using a more exact rational measurement scale” (Wang et al., 2006)
- Search – “Based on trial-and-error explorations to find a set of correlated objects, attributes, or relations for a given object or concept; or to find useful solutions for a given problem” (Wang et al., 2006)
The inference function contains eight cognitive processes, three of which (abduction, causation and recursion) are not defined (Wang, 2015) and five of which (deduction, induction, analogy, analysis and synthesis) have the following definitions (Wang et al., 2006):
- Deduction – “By which a specific conclusion necessarily follows from a set of general premises”
- Induction – “By which a general conclusion is drawn from a set of specific premises based mainly on experience or experimental evidences”
- Analogy – “Identifies similarity of the same relations between different domains or systems and/or examines that if two things agree in certain respects then they probably agree in others”
- Analysis – “Divides a physical or abstract object into its constitute parts in order to examine or determine their relationship deductively”
- Synthesis – “Combines objects or concepts into a complex whole inductively”
The intelligence function contains nine cognitive processes, one of which (modeling) is not defined and eight of which (comprehension, learning, problem solving, decision making, creation, planning, information fusion and pattern recognition) have the following definitions (Wang et al., 2006):
- Comprehension – “Searches relations between a given object (O) or attribute (A) and other objects, attributes, and relations (R) in LTM, and establishes a representative OAR model for the object or attribute by connecting it to the appropriate clusters of the LTM”
- Learning – “Gains knowledge of something or acquires skills in some action or practice by updating the cognitive models of the brain in LTM”
- Problem solving – “Searches a solution for a given problem or finds a path to reach a given goal”
- Decision making – “By which a preferred option or course of action is chosen from among a set of alternatives on the basis of given criteria”
- Creation – “Discovers a new relation between objects, attributes, concepts, phenomena, and events, which is original, proven true, and useful”
- Planning – “Generates abstract representations of future actions, statuses, or paths to achieving a given goal, based on current information”
- Information fusion – “Knowledge representation . . . describes how information can be appropriately encoded and utilized in the cognitive models of the brain”
- Pattern recognition – “Recognition . . . identifies an object by relating it to a concept or category, or comprehends a concept by known meanings”
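The layered taxonomy above can be collected into a small data structure. The following Python sketch is our own construction (the dictionary name and layout are invented); it transcribes the LRMB functions and their cognitive processes exactly as listed in this appendix:

```python
# LRMB (Wang et al., 2006; Wang, 2015): cognitive functions and their
# processes, transcribed from the appendix text above.
LRMB_PROCESSES = {
    "sensation": ["vision", "hearing", "motion", "smell", "touch", "taste",
                  "time", "spaciality"],        # last two undefined in Wang (2015)
    "action": ["reflex", "recurrent", "temporary", "complex"],
    "memory": ["LTM", "STM", "SBM", "ABM", "CSM"],
    "perception": ["attention", "consciousness", "motivation", "emotion",
                   "imagination", "attitude", "posture", "equilibrium"],
    "cognition": ["abstraction", "concept establishment", "categorization",
                  "memorization", "quantification", "search",
                  "identification of the objects", "comparison",
                  "qualification", "selection"],
    "inference": ["deduction", "induction", "analogy", "analysis",
                  "synthesis", "abduction", "causation", "recursion"],
    "intelligence": ["comprehension", "learning", "problem solving",
                     "decision making", "creation", "planning",
                     "information fusion", "pattern recognition", "modeling"],
}

# Cross-check against the process counts stated in the text.
assert [len(v) for v in LRMB_PROCESSES.values()] == [8, 4, 5, 8, 10, 8, 9]
```

The final assertion verifies the counts given in the prose (eight sensation, four action, five memory, eight perception, ten cognition, eight inference and nine intelligence processes).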

Paper IV
Authors: Gain U, Hotti V.

Year: 2017.

Article title: Tones and traits - experiments of text-based extractions with cognitive services.

Journal: Finnish Journal of EHealth and EWelfare, 9(2-3):82-94

Permissions from co-authors via email: Hotti Virpi received 9.11.2021 at 7:57

Reproduced with permission in this thesis


SCIENTIFIC PAPERS

Tones and traits - experiments of text-based extractions with cognitive services
Ulla Gain, Virpi Hotti

School of Computing, University of Eastern Finland, Kuopio, Finland

Ulla Gain, School of Computing, University of Eastern Finland, FI-70211, Kuopio, FINLAND. Email: [email protected]

Abstract

Cognitive services mimic brain mechanisms with algorithms, for example, by extracting emotions, language styles, and social tendencies from text. The cognitive services offer inferred insights that are even more detailed than the observation-based ground-truth data. In this paper, text-based extractions are carried out by cognitive services (i.e., Microsoft’s Text Analytics and IBM’s AlchemyLanguage, Natural Language Understanding, Personality Insights and Tone Analyzer). The tones (i.e., emotions, language styles and social tendencies) and personality traits (i.e., social tendencies, needs and values) are extracted from a website document in no time by the cognitive services, whose algorithmic functionality is exemplified. Both the tones and the personality traits are inferred insights that can be used to assess the personality. Finally, 47 personality traits are tabulated to illustrate the trait-based questions that can be used in the interventions.

Keywords: algorithms, cognitive function, personality assessment

Introduction

There are dozens of cognitive services (i.e., cognitive building blocks) that can be used to illustrate the automation capabilities. We already use the cognitive building blocks even without thinking that they are cognitive ones. For example, we dictate to an iPhone and it converts the speech into text. However, when we examine the cognitive services more precisely, we will realize that language, speech and vision are the main cognized domains of the services. For example, IBM has divided their cognitive services into four groups (i.e., language, speech, vision and data insights) [1], whereas Microsoft divides their cognitive services into five groups (language, speech, vision, knowledge and search) [2].

When we want to get the objective, or inferred, insights from unstructured data of different kinds (e.g., audio, images, text and video), then we need the cognitive services. The cognitive services give us the capabilities to get insights that are more objective. However, it has been found that the characteristics (e.g., personality traits) inferred from text can reliably predict a variety of real-world behavior [3]. Therefore, for example, the IBM Watson Personality Insights service provides a list of the behaviors (e.g., preferences [4]) that the personality is likely (e.g., treat yourself) or unlikely (e.g., put health at risk) to manifest.

It was the dream (e.g., “uncover the “information nuggets” [5], “digging into the data . . . to find out relevant items of the data” [6] or make the rule-based content categorization [7]) to get something useful from text cognitively until the IBM Watson Personality Insights service manifested with the text-based personality insights [8]. For example, the healthcare utilizations of
22.5.2017 FinJeHeW 2017;9(2–3) 82



the personality insights may differ from self-study to personalized services [8] (e.g., matching individuals, such as doctor-patient matching because patients prefer doctors who are similar to themselves, or monitoring and predicting mental health, such as predicting postpartum and other forms of depression from social media [9]). During the research on the cognitive services, we found only two services whose extractions concern the social tendencies [10,11]. Hence, we found, for example, the global online platform called Talkspace that uses the Personality Insights “to better match users with therapists in their network using a self-learning system that seeks to better understand the traits of individual”, and the platform “allows users to chat with a licensed therapist confidentially and anonymously” [12].

The personality traits (e.g., extraversion and openness) are heritable and associated with behavioral outcomes. For example, extraversion is associated with "psychosocial, lifestyle and health outcomes such as academic and job performance, well-being, obesity, substance use, physical activity, bipolar disorder, borderline personality disorder, Alzheimer’s disease, and longevity" [13].

In this paper, we report the experiments of the text-based extractions including the tones (i.e., emotions, language styles and social tendencies) and traits (i.e., social tendencies and their facets, needs and values). One of our main aims is to find out the argumentations for the extractions. Therefore, we construct the workflow for value propositions as an orchestrated attempt by the cognitive services, the outcomes of which manifest the behavioral outcomes or help to revise the language styles.

Material and methods

The cognitive services can constitute a flexible building block assemblage. There are only two cognitive services (i.e., IBM Tone Analyzer and Personality Insights) that extract the social tendencies. The personality characteristics (i.e., social tendencies, needs and values) are based on an open vocabulary approach. The input text is tokenized and transformed to a representation in an n-dimensional space, and the service uses an open source word embedding technique called GloVe to obtain a vector representation for the words in the input text [10].

Sometimes the web pages that have many advertisements might contain the crucial arguments, especially when a new technology is adopted. In this paper, we used the URL of the TOGAF 9.1 Content Metamodel [14], the content of which is used to exemplify the extractions of different kinds (i.e., emotions, language styles, social tendencies, needs and values), as well as to illustrate the outcomes of the cognitive services with minimum human interventions.

First, we combined the cognitive services of two providers (i.e., IBM and Microsoft). However, the demonstration of the Microsoft Text Analytics API [15] did not summarize the key phrases without text manipulations (i.e., human intervention is needed). Therefore, the Text Analytics service is excluded from our final construction (Figure 1). The detailed workflow of the utilized cognitive services is as follows:

- IBM AlchemyLanguage – Paste URL, Select Text Extraction, Copy text
- IBM Tone Analyzer – Paste Your own text, Click Analyze, Identify sentences with stronger tones in context or sorted by score
- IBM Personality Insights – Paste Your own text, Click Analyze, Identify personality traits (i.e., social tendencies / personality types and their facets, needs, and values).
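The tokenize-and-embed step described in the text can be sketched as follows. This is a toy illustration only: the three-dimensional vectors below are invented, whereas the actual service uses real GloVe embeddings with far more dimensions, and its document-level representation is not necessarily a plain mean:

```python
# Sketch of the tokenize-and-embed step: each word is mapped to an
# n-dimensional vector, and the document is represented here by the
# mean vector. The values below are toy numbers, NOT real GloVe vectors.
TOY_GLOVE = {
    "content":   [0.2, 0.1, 0.7],
    "metamodel": [0.4, 0.3, 0.1],
    "defines":   [0.1, 0.5, 0.2],
}

def tokenize(text: str) -> list[str]:
    """Naive whitespace tokenizer; real services tokenize far more carefully."""
    return text.lower().split()

def document_vector(text: str) -> list[float]:
    """Average the word vectors of the known tokens in the input text."""
    vectors = [TOY_GLOVE[w] for w in tokenize(text) if w in TOY_GLOVE]
    n = len(vectors)
    return [sum(dim) / n for dim in zip(*vectors)]

print(document_vector("Content metamodel defines"))
```

The resulting vector is what a downstream model could consume to infer, for example, personality characteristics from the text.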




Figure 1. Construction for text extraction (drawn by Microsoft Vision).

Finally, we will try to explain the results of the extractions. When we tried to formalize the relationships between traits and summaries, we found that no mathematical relationship exists, for example, between “the percentiles reported for Big Five dimensions and facets” [16]. Therefore, it is impossible to compare whether the cognitive services offer even more detailed explanations of the calculated results than the results of the on-line personality questionnaires based on a public domain collection of the IPIP items for use in personality tests. The International Personality Item Pool (IPIP) offers, for example, 50 personality questions, the calculation of which is explained [17] (Appendix 1).

The IPIP questions illustrate the simplicity of the personality questionnaires, whereas the explanations of the traits are complex ones. Therefore, we try to form the trait-based questions to support, for example, the trait-based intervention. We use the IBM Natural Language Understanding service [18] to extract the semantic roles (i.e., subject, action and object) of the sentences (i.e., the explanations of the traits). We use the semantic roles to exemplify the trait-based personality questionnaire as follows (Figure 2):

- IF the action is are THEN the question starts by Am (see the rule 1)
- IF the object contains your THEN the question contains my (see the rule 1)
- IF ahead of the action there is the expression or term THEN it will be included in the question (see the rule 2)
- IF ahead of the action there is are THEN it will be included in the question (see the rule 3 and the rule 4)
- IF the sentence contains yourself THEN the question contains myself (see the rule 4)
- IF the sentence contains two or more actions THEN the question starts by the first action (see the rule 5)
- IF the sentence contains two or more you THEN the meaning of the sentence affects the replacements (i.e., me, I or I am if there is your are).
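A minimal sketch of how rules of this kind can be applied in practice (our own illustration; it covers only the are/your/yourself rules above, and the function name is invented):

```python
def trait_question(explanation: str) -> str:
    """Rewrite a second-person trait explanation into a first-person
    question, applying the are -> Am, your -> my and
    yourself -> myself rules described in the text."""
    q = explanation.strip().rstrip(".")
    # Replace "yourself" before "your" so the longer match wins.
    q = q.replace("yourself", "myself").replace("your", "my")
    if q.lower().startswith("you are "):
        q = "Am I " + q[len("You are "):]
    return q + "?"

print(trait_question("You are aware of your feelings."))
# -> Am I aware of my feelings?
```

A fuller implementation would work on the semantic roles (subject, action, object) extracted by the Natural Language Understanding service rather than on raw strings.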




Figure 2. Rules of the trait-based questions (drawn by Microsoft Vision).

If the trait-based questions are answered with scores from zero to one hundred (i.e., zero corresponds to a trait that does not exist and one hundred to its opposite), then the straight comparability of the results remains (i.e., between the result scores of the IBM Personality Insights service and the answers to the trait-based questions).

Results

The Tone Analyzer service produces the tones (i.e., emotions, language styles and social tendencies) at the document level (Figure 3). The tone scores indicate the probability of the tones. The emotion tones are anger, disgust, fear, joy and sadness. The language tone describes the writing style, the categories of which are analytical, confident and tentative. The social tones (openness, conscientiousness, extraversion, agreeableness and emotional range) are adapted from the big-five personality model.
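The comparability argument above can be illustrated with a short sketch (the percentile and the answer below are invented numbers, used only to show the scaling):

```python
# Invented example values: the service reports trait percentiles in [0, 1],
# while a trait-based question is answered on a 0..100 scale.
service_percentile = 0.82     # e.g., a trait percentile reported by the service
questionnaire_answer = 75     # answer to the corresponding trait-based question

# Dividing the answer by 100 puts both scores on the same 0..1 scale,
# so they can be compared directly.
self_score = questionnaire_answer / 100
difference = abs(service_percentile - self_score)
print(round(difference, 2))  # 0.07
```

Keeping both scores on one scale is what preserves the "straight comparability" the text refers to.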




Figure 3. Tones at the document level.

The sentences are highlighted in color by the selected tone (Figure 4). Furthermore, it is possible to revise the tones of the sentences in context or ranked by the score. The limits of the scores are observable (i.e., < 5 %, 5–75 % or > 75 %) [19]. The model includes five primary dimensions (i.e., social tendencies) as follows [20]:

- Openness “is the extent to which a person is open to experiencing a variety of activities”,
- Conscientiousness “is a person's tendency to act in an organized or thoughtful way”,
- Extraversion “is a person's tendency to seek stimulation in the company of others”,
- Agreeableness “is a person's tendency to be compassionate and cooperative toward others”,
- Emotional Range, “also referred to as Neuroticism or Natural Reactions, is the extent to which a person's emotions are sensitive to the person's environment”.
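The observable score limits mentioned above can be expressed as a small helper function (a sketch of our own; the band names are our reading of the thresholds, not IBM's wording):

```python
def tone_band(score: float) -> str:
    """Map a tone score (0..1) to the observable bands
    < 5 %, 5-75 % and > 75 % mentioned in the text."""
    if score < 0.05:
        return "unlikely to be perceived"
    if score > 0.75:
        return "very likely to be perceived"
    return "possibly perceived"

print(tone_band(0.82))  # very likely to be perceived
```

Such banding makes it easy to filter, for example, only the sentences whose tones exceed the 75 % limit.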

Figure 4. Tones at the sentence level.



SCIENTIFIC PAPERS

Figure 5. Example of summary from the IBM Personality Insights service.

The Personality Insights service generates the summary (Figure 5) and the sunburst from the inferred personality traits. Furthermore, it gives short descriptions for the traits. However, we do not use them because they do not explain the high or low scores of the traits.

The primary and secondary dimensions affect the first sentence of the summary (i.e., the first paragraph) [21]. The second paragraph of the summary is generated using the three main facets of the primary dimensions [22]. However, it is not explained explicitly how the certain adjectives (i.e., shrewd and critical) of the first sentence and the statements of the second paragraph are selected.

There are 60 explanations for 30 facets (i.e., traits) of five dimensions, and we tabulated them (Appendix 2) following the traits in the sunburst (Figure 6). Furthermore, we have used the table to highlight the text-based trait findings because each of them can be evaluated separately.

We did not find the explanations of the summary for the traits of the needs (the 3rd paragraph of the summary) and values (the 4th paragraph of the summary). However, we tabulate the found definitions of the needs and values. The needs of the author (Table 1) resonate with the aspects of the products or services. The values describe the human’s (or the author’s) motivating factors that influence decision-making. There are only explanations for high values (Table 2). Furthermore, we found only one study [23] in which two traits are mapped explicitly to behavioral outcomes, and the research results are statistically significant.




Figure 6. Example of the sunburst generated by the IBM Watson Personality Insight service [10].

Table 1. Example of the needs the definitions of which are adapted [24].

Trait	High definitions
Challenge	“Have an urge to achieve, to succeed, and to take on challenges”
Closeness	“Relish being connected to family and setting up a home”
Curiosity	“Have a desire to discover, find out, and grow”
Excitement	“Want to get out there and live life, have upbeat emotions, and want to have fun”
Harmony	“Appreciate other people, their viewpoints, and their feelings”
Ideal	“Desire perfection and a sense of community”
Liberty	“Have a desire for fashion and new things, as well as the need for escape”
Love	“Enjoy social contact, whether one-to-one or one-to-many. Any brand that is involved in bringing people together taps this need”
Practicality	“Have a desire to get the job done, a desire for skill and efficiency, which can include physical expression and experience”
Self-expression	“Enjoy discovering and asserting . . . own identities”
Stability	“Seek equivalence in the physical world . . . favor the sensible, the tried and tested”
Structure	“Exhibit groundedness and a desire to hold things together . . . need things to be well organized and under control”




Table 2. Example of the values the definitions of which are adapted [25].
Trait	High definitions	Examples of behavioral outcomes
Conservation	“Emphasize self-restriction, order, and resistance to change”
Openness to change	“Emphasize independent action, thought, and feeling, as well as a readiness for new experiences”
Hedonism	“Seek pleasure and sensuous gratification”
Self-enhancement	“Seek personal success”	Read articles about the work
Self-transcendence	“Show concern for the welfare and interests of others”	Read articles about the environment

Finally, we formed the trait-based questions (Table 1) using the rules, the specifications of which are based on the denotations of the IBM Natural Language Understanding service. Although our rules generate some grammatical issues (e.g., 'am aware' instead of 'am I aware'), the results remain recognizable. Further, we realized that the definitions of the needs and values follow our rules. However, the explanations of the facets are transformed into the question format. Instead of the summary, the traits and their definitions or explanations offer the starting point for the discussions during the interventions. Therefore, the tables are useful for interpreting the results of the IBM Watson Personality Insights service. Furthermore, 47 traits open up the features of the human being, and they even explain the constraints for collaboration of different kinds.

Discussion

The cognitive services can be consciously used to decrease the human interventions if the argumentations of the cognitive services are acceptable. Therefore, the argumentation is needed, i.e., how the extractions (e.g., emotions, language styles, social tendencies, needs and values) can be explained, or even reached, through logical reasoning. The intentional and unintentional experiences of the authors affect the language, which can be indicated (e.g., whether the sentences are analytical, confident or tentative). Moreover, the research background of the cognitive services has to be explicitly figured out when the cognitive services and their applications are used to predict behavioral outcomes and to provide recommendations.

The cognitive services can accumulate our cognition. For example, we can extract the tones (i.e., emotions, language styles and social tendencies) and personality traits (i.e., social tendencies and their facets, needs and values) from a website document in no time. We can use the outcomes (e.g., tones) to revise our text, or we can use the facilitated denotations (e.g., semantic roles).

Usually, when we talk about the personality traits (e.g., social tendencies, needs and values), we trust the ground-truth data without knowing that the inferred insights are more detailed ones. Traditionally, the personality traits are surveyed to get the ground-truth data (e.g., IPIP). Despite the lack of studies to constitute the heritable nature (i.e., the modes of gene action) of each personality trait, the traits should be taken into consideration during interventions, as well as in preventive operations.

Before professionals such as psychotherapists bring the IBM Personality Insights service into use, the functionality of the algorithm has to be verified together with substance professionals. Furthermore, more research has to be done to connect the behavioral outcomes with the personality traits. However, the IBM Personality Insights service can be used within the interventions. Our established tables of the traits can be used when the results of the Personality Insights service are interpreted.

The inferred personality traits are the evidence-based (i.e., spoken or written text) insights of the consciousness of the human being. The intentional and unintentional experiences affect the consciousness of the human being, the degree, clarity and language of which can




be indicated. In the future, we will get the non-repudiation data through gene tests. Hence, the inferred personality traits can be used to predict the behavioral outcomes.

References

[1] IBM. Watson services [cited 2017 Mar 30]. Available from: http://www.ibm.com/watson/developercloud/services-catalog.html

[2] Microsoft. Discover API’s [cited 2017 Mar 30]. Available from: https://www.microsoft.com/cognitive-services/en-us/apis

[3] IBM. How precise is the service [cited 2017 Mar 30]. Available from: http://www.ibm.com/watson/developercloud/doc/personality-insights/science.shtml#researchPrecise

[4] IBM. Consumption preferences [cited 2017 Mar 30]. Available from: http://www.ibm.com/watson/developercloud/doc/personality-insights/preferences.shtml

[5] Hotti V, Gain U. Big Data Analytics for Professionals, Data-milling for Laypeople. World Journal of Computer Application and Technology, 2013;1(2);51–57.

[6] Hotti V, Gain U, Lintula H, Puumalainen A, Salomaa H. Construction of business-driven capta processing. Advanced Research in Scientific Areas, EDIS - Publishing Institution of the University of Zilina. 2014;3(1).

[7] Hotti V. Sisällön tulkinta haastaa big datan louhijat [Content interpretation challenges big data miners]. Futura. 2016;2:70–9.

[8] Hotti V, Gain U. Exploitation and exploration underpin business and insights underpin business analytics. 6th International Conference on Well-Being in the Information Society, 2016, p. 223–37.

[9] IBM. Use cases [cited 2017 Mar 30]. Available from: http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/doc/personality-insights/usecases.shtml

[10] IBM. Personality Insights [cited 2017 Mar 30]. Available from: https://www.ibm.com/watson/developercloud/personality-insights.html

[11] IBM. Tone Analyzer [cited 2017 Mar 30]. Available from: https://www.ibm.com/watson/developercloud/tone-analyzer.html

[12] Vorhies W. 30 Fun Ideas for Starting New AI Businesses and Services with Watson [cited 2017 Mar 30]. Available from: http://www.datasciencecentral.com/profiles/blogs/30-fun-ideas-for-starting-new-ai-businesses-and-services-with-wat

[13] van den Berg SM, de Moor MHM, Verweij KJH, Krueger RF, Luciano M, Vasquez AA, et al. Meta-analysis of Genome-Wide Association Studies for Extraversion: Findings from the Genetics of Personality Consortium. Behavior Genetics. 2016;46(2):170–82. https://doi.org/10.1007/s10519-015-9735-5

[14] The Open Group. TOGAF, Content Metamodel [cited 2017 Mar 30]. Available from: http://pubs.opengroup.org/architecture/togaf9-doc/arch/chap34.html#tag_34

[15] Microsoft. Text Analytics API [cited 2017 Mar 30]. Available from: https://www.microsoft.com/cognitive-services/en-us/text-analytics-api

[16] IBM. Personality Insights [cited 2017 Mar 30]. Available from: https://www.ibm.com/watson/developercloud/doc/personality-insights/output.shtml#numeric

[17] The International Personality Item Pool. IPIP [cited 2017 Mar 30]. Available from: http://ipip.ori.org/newScoringInstructions.htm

[18] IBM. Natural Language Understanding [cited 2017 Mar 30]. Available from: https://www.ibm.com/watson/developercloud/natural-language-understanding.html

[19] IBM. Emotional tone [cited 2017 Mar 30]. Available from: http://www.ibm.com/watson/developercloud/doc/tone-analyzer/understand-tone.html#emotional-tone

[20] IBM. Personality Insights basics [cited 2017 Mar 30]. Available from: http://www.ibm.com/watson/developercloud/doc/personality-insights/basics.shtml

[21] IBM. Big Five Personality Dimensions: Characteristics of Individuals with High- and Low-Value Combinations of Dimensions [cited 2017 Mar 30]. Available from: https://github.com/watson-developer-cloud/doc-tutorial-downloads/raw/master/personality-insights/Personality-Insights-Dimension-Characteristics.pdf

[22] IBM. Personality models [cited 2017 Mar 30]. Available from: https://www.ibm.com/watson/

[23] IBM. The service in action [cited 2017 Mar 30]. Available from: http://www.ibm.com/watson/developercloud/doc/personality-insights/applied.shtml

[24] IBM. Needs [cited 2017 Mar 30]. Available from: https://www.ibm.com/watson/developercloud/doc/personality-insights/models.shtml#outputNeeds

[25] IBM. Values [cited 2017 Mar 30]. Available from:
developercloud/doc/personality-insights/models.shtml
http://www.ibm.com/watson/developercloud/doc/pers
onality-insights/models.shtml#outputValues

22.5.2017 FinJeHeW 2017;9(2–3) 91


SCIENTIFIC PAPERS

Appendix 1. Dimensions and IPIP questions, as well as the scores of the answers.
O = Openness, C = Conscientiousness, E = Extraversion, A = Agreeableness, ER = Emotional Range, #Q = question number, VEI = Very inaccurate, MOI = Moderately inaccurate, INA = Neither inaccurate nor accurate, MOA = Moderately accurate, VEA = Very accurate.

Trait #Q Cited IPIP questions [17] VEI MOI INA MOA VEA

E 6. Don't talk a lot. 1 2 3 4 5
E 16. Keep in the background. 1 2 3 4 5
E 26. Have little to say. 1 2 3 4 5
E 36. Don't like to draw attention to myself. 1 2 3 4 5
E 46. Am quiet around strangers. 1 2 3 4 5
E 1. Am the life of the party. 5 4 3 2 1
E 11. Feel comfortable around people. 5 4 3 2 1
E 21. Start conversations. 5 4 3 2 1
E 31. Talk to a lot of different people at parties. 5 4 3 2 1
E 41. Don't mind being the center of attention. 5 4 3 2 1
A 2. Feel little concern for others. 1 2 3 4 5
A 12. Insult people. 1 2 3 4 5
A 22. Am not interested in other people's problems. 1 2 3 4 5
A 32. Am not really interested in others. 1 2 3 4 5
A 7. Am interested in people. 5 4 3 2 1
A 17. Sympathize with others' feelings. 5 4 3 2 1
A 27. Have a soft heart. 5 4 3 2 1
A 37. Take time out for others. 5 4 3 2 1
A 42. Feel others' emotions. 5 4 3 2 1
A 47. Make people feel at ease. 5 4 3 2 1
C 8. Leave my belongings around. 1 2 3 4 5
C 18. Make a mess of things. 1 2 3 4 5
C 28. Often forget to put things back in their proper place. 1 2 3 4 5
C 38. Shirk my duties. 1 2 3 4 5
C 3. Am always prepared. 5 4 3 2 1
C 13. Pay attention to details. 5 4 3 2 1
C 23. Get chores done right away. 5 4 3 2 1
C 33. Like order. 5 4 3 2 1
C 43. Follow a schedule. 5 4 3 2 1
C 48. Am exacting in my work. 5 4 3 2 1
ER 4. Get stressed out easily. 1 2 3 4 5
ER 14. Worry about things. 1 2 3 4 5
ER 24. Am easily disturbed. 1 2 3 4 5
ER 29. Get upset easily. 1 2 3 4 5
ER 34. Change my mood a lot. 1 2 3 4 5
ER 39. Have frequent mood swings. 1 2 3 4 5
ER 44. Get irritated easily. 1 2 3 4 5
ER 49. Often feel blue. 1 2 3 4 5
ER 9. Am relaxed most of the time. 5 4 3 2 1
ER 19. Seldom feel blue. 5 4 3 2 1
O 10. Have difficulty understanding abstract ideas. 1 2 3 4 5
O 20. Am not interested in abstract ideas. 1 2 3 4 5
O 30. Do not have a good imagination. 1 2 3 4 5
O 5. Have a rich vocabulary. 5 4 3 2 1
O 15. Have a vivid imagination. 5 4 3 2 1
O 25. Have excellent ideas. 5 4 3 2 1
O 35. Am quick to understand things. 5 4 3 2 1
O 40. Use difficult words. 5 4 3 2 1
O 45. Spend time reflecting on things. 5 4 3 2 1
O 50. Am full of ideas. 5 4 3 2 1
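The scoring scheme in the table above can be sketched in code: items printed with the values "1 2 3 4 5" score in that direction (Very Inaccurate = 1 … Very Accurate = 5), items printed with "5 4 3 2 1" are reverse-scored, and each dimension sums its ten items. The sketch below is illustrative only; the function and variable names are our own, and the item groupings and scoring directions follow the table as printed.

```python
# Sketch of scoring the 50-item questionnaire of Appendix 1.
# Answers are on a 1..5 scale (Very Inaccurate .. Very Accurate).
# Items printed with "5 4 3 2 1" in the table are reverse-scored.

TRAIT_ITEMS = {                      # trait -> (direct items, reversed items)
    "E":  ([6, 16, 26, 36, 46], [1, 11, 21, 31, 41]),
    "A":  ([2, 12, 22, 32], [7, 17, 27, 37, 42, 47]),
    "C":  ([8, 18, 28, 38], [3, 13, 23, 33, 43, 48]),
    "ER": ([4, 14, 24, 29, 34, 39, 44, 49], [9, 19]),
    "O":  ([10, 20, 30], [5, 15, 25, 35, 40, 45, 50]),
}

def score(answers):
    """Sum each trait's item scores; answers maps item number -> 1..5."""
    totals = {}
    for trait, (direct, reverse) in TRAIT_ITEMS.items():
        totals[trait] = (sum(answers[i] for i in direct)
                         + sum(6 - answers[i] for i in reverse))
    return totals

# Example: answering 3 ("neither inaccurate nor accurate") to every item
neutral = {i: 3 for i in range(1, 51)}
print(score(neutral))  # every trait totals 30 (10 items x 3 points)
```

Each dimension has ten items, so with the neutral answer every dimension sums to 30; the direct/reverse split only matters once answers deviate from the midpoint.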


Appendix 2. Personality traits: facets, with explanations adapted from [10].

Facet / trait | High explanation | Low explanation
Adventurousness | Adventurous: "You are eager to experience new things" | Consistent: "You enjoy familiar routines and prefer not to deviate from them"
Artistic interests | Appreciative of art: "You enjoy beauty and seek out creative experiences" | Unconcerned with art: "You are less concerned with artistic or creative activities than most people"
Emotionality | Emotionally aware: "You are aware of your feelings and how to express them" | Dispassionate: "You do not frequently think about or openly express your emotions"
Imagination | Imaginative: "You have a wild imagination" | Down-to-earth: "You prefer facts over fantasy"
Intellect | Philosophical: "You are open to and intrigued by new ideas and love to explore them" | Concrete: "You prefer dealing with the world as it is, rarely considering abstract ideas"
Authority-challenging | Authority-challenging: "You prefer to challenge authority and traditional values to effect change" | Respectful of authority: "You prefer following with tradition to maintain a sense of stability"
Achievement striving | Driven: "You set high goals for yourself and work hard to achieve them" | Content: "You are content with your level of accomplishment and do not feel the need to set ambitious goals"
Cautiousness | Deliberate: "You carefully think through decisions before making them" | Bold: "You would rather take action immediately than spend time deliberating making a decision"
Dutifulness | Dutiful: "You take rules and obligations seriously, even when they are inconvenient" | Carefree: "You do what you want, disregarding rules and obligations"
Orderliness | Organized: "You feel a strong need for structure in your life" | Unstructured: "You do not make a lot of time for organization in your daily life"
Self-discipline | Persistent: "You can tackle and stick with tough tasks" | Intermittent: "You have a hard time sticking with difficult tasks for a long period of time"
Self-efficacy | Self-assured: "You feel you have the ability to succeed in the tasks you set out to do" | Self-doubting: "You frequently doubt your ability to achieve your goals"
Activity level | Energetic: "You enjoy a fast-paced, busy schedule with many activities" | Laid-back: "You appreciate a relaxed pace in life"
Assertiveness | Assertive: "You tend to speak up and take charge of situations, and you are comfortable leading groups" | Demure: "You prefer to listen than to talk, especially in group situations"
Cheerfulness | Cheerful: "You are a joyful person and share that joy with the world" | Solemn: "You are generally serious and do not joke much"
Excitement-seeking | Excitement-seeking: "You are excited by taking risks and feel bored without lots of action going on" | Calm-seeking: "You prefer activities that are quiet, calm, and safe"
Outgoing | Outgoing: "You make friends easily and feel comfortable around other people" | Reserved: "You are a private person and do not let many people in"
Gregariousness | Sociable: "You enjoy being in the company of others" | Independent: "You have a strong desire to have time to yourself"
Altruism | Altruistic: "You feel fulfilled when helping others and will go out of your way to do so" | Self-focused: "You are more concerned with taking care of yourself than taking time for others"
Cooperation | Accommodating: "You are easy to please and try to avoid confrontation" | Contrary: "You do not shy away from contradicting others"
Modesty | Modest: "You are uncomfortable being the center of attention" | Proud: "You hold yourself in high regard and are satisfied with who you are"
Uncompromising | Uncompromising: "You think it is wrong to take advantage of others to get ahead" | Compromising: "You are comfortable using every trick in the book to get what you want"
Sympathy | Empathetic: "You feel what others feel and are compassionate toward them" | Hard-hearted: "You think people should generally rely more on themselves than on others"
Trust | Trusting of others: "You believe the best in others and trust people easily" | Cautious of others: "You are wary of other people's intentions and do not trust easily"
Fiery | Fiery: "You have a fiery temper, especially when things do not go your way" | Mild-tempered: "It takes a lot to get you angry"
Prone to worry | Prone to worry: "You tend to worry about things that might happen" | Self-assured: "You tend to feel calm and self-assured"
Melancholy | Melancholy: "You think quite often about the things you are unhappy about" | Content: "You are generally comfortable with yourself as you are"
Immoderation | Hedonistic: "You feel your desires strongly and are easily tempted by them" | Self-controlled: "You have control over your desires, which are not particularly intense"
Self-consciousness | Self-conscious: "You are sensitive about what others might be thinking of you" | Confident: "You are hard to embarrass and are self-confident most of the time"
Susceptible to stress | Susceptible to stress: "You are easily overwhelmed in stressful situations" | Calm under pressure: "You handle unexpected events calmly and effectively"
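The high/low wording above lends itself to a simple lookup: given a facet percentile from the service, pick the high-value description above some threshold and the low-value one otherwise. A minimal sketch with two facets transcribed from the table; the `describe` helper and the 0.5 cut-off are our own illustrative assumptions, not part of the service.

```python
# Map a facet percentile (0..1) to the high- or low-value wording of Appendix 2.
# FACETS holds (high, low) description pairs from the table; the 0.5
# threshold below is an illustrative assumption, not defined by the service.

FACETS = {
    "Adventurousness": (
        'Adventurous: "You are eager to experience new things"',
        'Consistent: "You enjoy familiar routines and prefer not to deviate from them"',
    ),
    "Cautiousness": (
        'Deliberate: "You carefully think through decisions before making them"',
        'Bold: "You would rather take action immediately than spend time deliberating"',
    ),
}

def describe(facet, percentile, threshold=0.5):
    """Return the high-value wording at or above the threshold, else the low one."""
    high, low = FACETS[facet]
    return high if percentile >= threshold else low

print(describe("Adventurousness", 0.82))  # high-value wording
print(describe("Cautiousness", 0.17))     # low-value wording
```

The same lookup extends to all thirty facets by adding the remaining rows of the table to `FACETS`.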



Paper V
Authors: Gain U.

Year: 2020.

Article title: The cognitive function and the framework of the functional hierarchy.

Journal: Applied Computing and Informatics, Emerald Publishing Limited

Copyright:

DOI: https://doi.org/10.1016/j.aci.2018.03.003
The current issue and full text archive of this journal is available on Emerald Insight at:
https://www.emerald.com/insight/2210-8327.htm

The cognitive function and the framework of the functional hierarchy

Ulla Gain
School of Computing, University of Eastern Finland, Kuopio, Finland

Received 19 October 2017
Revised 13 January 2018
Accepted 12 March 2018

Abstract
Cognitive computing is part of AI, and cognitive applications consist of cognitive services, which are building blocks of cognitive systems. These applications mimic human brain functions, for example, recognizing a speaker or sensing the tone of a text. In this paper, we present the similarities of these functions with human cognitive functions. We establish a framework that gathers cognitive functions into nine intentional processes derived from the substructures of the human brain. The framework underpins human cognitive functions and categorizes cognitive computing functions into a functional hierarchy, through which we present the functional similarities between cognitive services and human cognitive functions to illustrate what kinds of functions are cognitive in computing. The results of the comparison based on the functional hierarchy of cognitive functions are consistent with the cognitive computing literature. Thus, the functional hierarchy allows us to identify the type of cognition involved and to achieve comparability between applications.
Keywords Cognitive function, Human cognitive functions, Framework of the functional hierarchy, Cognitive service, Cognitive computing
Paper type Original Article

1. Introduction
In the computing domain, the word 'cognitive' has become common. Since cognitive computing approaches cognition and imitates human cognitive processing, it is essential to explicate the correspondences between human cognitive functions and the functions in cognitive computing. Once this close similarity is clarified, it helps us identify, use, and classify their properties and abilities. Therefore, we construct a framework of the functional hierarchy. We searched for related research in Google Scholar and Scopus. A Google Scholar search returned 2 hits for the search terms "cognitive service" AND "cognitive function" AND "mapping"; the topics of these articles are (1) Cognition and the web, and (2) Pharm.Care@BLED Build - Lead - Engage - Disseminate. The corresponding Scopus search returned 0 hits. Therefore, we present and map the functions of the cognitive services

© Ulla Gain. Published in Applied Computing and Informatics. Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) license. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this license may be seen at http://creativecommons.org/licences/by/4.0/legalcode

Publisher's note: The publisher wishes to inform readers that the article "The cognitive function and the framework of the functional hierarchy" was originally published by the previous publisher of Applied Computing and Informatics and the pagination of this article has been subsequently changed. There has been no change to the content of the article. This change was necessary for the journal to transition from the previous publisher to the new one. The publisher sincerely apologises for any inconvenience caused. To access and cite this article, please use Gain, U. (2020), "The cognitive function and the framework of the functional hierarchy", Applied Computing and Informatics, Vol. 16 No. 1/2, pp. 81-116. The original publication date for this paper was 13/03/2018.

Applied Computing and Informatics, Vol. 16 No. 1/2, 2020, pp. 81-116. Emerald Publishing Limited. e-ISSN: 2210-8327, p-ISSN: 2634-1964. DOI 10.1016/j.aci.2018.03.003
(i.e., IBM's Visual Recognition service, Microsoft's Speaker Recognition and IBM's Tone Analyzer) into the human cognitive functions to illustrate the types of functionalities that can be considered cognitive. The research steps and results are presented in the order in which the study proceeds (Figure 1).
The cognitive computing functions imitate the cognitive functions of the human mind. Therefore, we present examples of the cognitive functions of the human brain (Section 3). Further (Section 4), we construct a framework, grounded in the human cognitive functions, to categorize the cognitive computing functions. Thereafter, we use these categories to map the cognitive functions (Section 5). In Section 6, we present the results of comparing the similarities and differences between the cognitive computing functions and applications that are not defined as cognitive (NDC) ones.

2. Material and methods

In this paper, we focus on the functions of neural systems that are associated with brain structures. We used the cognitive functions of the 3D Brain mobile application, which illustrates and explains the brain structures with their associated functions [1]. The 3D Brain contains descriptions of the whole brain with 28 substructures and the associated functions.

Figure 1. Research steps.

At first, we tabulated the 3D Brain structures and the associated functions of each structure (see Appendix A, Table 2). Further, we drafted a graph view of this information.
We used the 3D Brain information to construct the graph view of the human brain. First, we illustrated each lobe and its structures in the graph. Second, we added the associated functions to each structure and attached the link between them (i.e., the link between structure and function). Third, we constructed the links between structures by following the description information. During the construction of the graph, we found that the functions form chains of processes for performing a task, e.g., visual perception. Further, some processes, such as language processes, needed further description. Therefore, we supplemented the structures and associated functions of the brain presentation (see the descriptions of the main processes). Further, we grouped the brain functions according to process, which yields nine main processes. These groups are obtained from the intentional functions of the brain and from the paths through which sensory stimuli transition into the cognitive processes and further to the outcome. If an associated cognitive function of a brain structure occurred in more than one process, then we listed it in each process in which it participates. These cognitive groups of functions form a functional interactive hierarchy, in which the associated cognitive functions are grouped adjacent to their respective sensory stimuli (i.e., visual functions, auditory functions, motor functions, sensation functions, homeostasis functions); further, the language-related and the emotion- and behavior-related multi-process functions are presented; then, memory-related functions are collected into memory functions; and finally, the higher, executive, and complex cognitive functions are placed in cognitive functions. When we tried to compare human cognitive functions with cognitive service functions, we found that the extent of the functionality is not the same; therefore, we presented mappings of the similarities between the cognitive functions of the applications (i.e., cognitive services and NDC applications) and the human cognitive functions. Finally, we used the functional hierarchy to compare the similarities and differences between the cognitive functions in cognitive services and the NDC applications.
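The tabulation-and-grouping step described above can be illustrated with a small data sketch: structures are linked to their associated functions, and a function that occurs in several processes is listed under each of them, as in the functional hierarchy. The structure/function pairs below are a small, simplified excerpt of the ones discussed in this paper; the helper name `group_by_process` is our own.

```python
# Sketch of the grouping step: brain structures -> associated functions,
# then inverted into process groups. A function serving several processes
# appears in each group. Excerpted, simplified data for illustration only.
from collections import defaultdict

STRUCTURE_FUNCTIONS = {
    "Superior Temporal Gyrus": ["auditory information processing", "auditory memory"],
    "Wernicke's area": ["language comprehension"],
    "Hippocampus": ["early memory storage", "formation of long-term memory"],
}

PROCESS_OF = {  # function -> the main processes it participates in
    "auditory information processing": ["auditory processes", "language processes"],
    "auditory memory": ["auditory processes", "memory processes"],
    "language comprehension": ["language processes"],
    "early memory storage": ["memory processes"],
    "formation of long-term memory": ["memory processes"],
}

def group_by_process(structure_functions, process_of):
    """Invert structure->function links into process -> [(structure, function)]."""
    groups = defaultdict(list)
    for structure, functions in structure_functions.items():
        for fn in functions:
            for process in process_of[fn]:
                groups[process].append((structure, fn))
    return dict(groups)

groups = group_by_process(STRUCTURE_FUNCTIONS, PROCESS_OF)
for process, pairs in sorted(groups.items()):
    print(process, "->", pairs)
```

Note how "auditory information processing" lands in both the auditory and the language groups, mirroring the rule that a function occurring in more than one process is listed in each.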

3. About human brain structures and how they underpin the cognitive functions
The brain graph with connected structures and functions visualizes the parts of the brain. From this sketched graph, we grouped the brain functions according to process into nine main processes: language processes, auditory processes, visual processes, sensation processes, homeostasis processes, motor processes, emotion and behavior processes, memory processes, and cognitive processes. The main groups are formed from the intentional functions of the brain's substructures. For example, the language processes group contains all language-related functions, such as auditory, speech, and written language production and comprehension. These categories are obtained from the intentional functions of the brain and from the paths through which sensory stimuli transition into the cognitive processes and further to the outcome. The next eight paragraphs present a short description of each (auditory processes are described in connection with language processes):
1. Language processing functions are attached to many brain structures. The first cerebral
cortex to receive auditory input is the primary auditory cortex, which is a substructure
of the Superior Temporal Gyrus, located in the Temporal lobe. Brodmann areas 42 and 22
interpret the auditory stimuli and auditory associations [2]. The Superior Temporal
Gyrus has functions for auditory information processing, and it uses a tonotopic map
(i.e., a map of sounds and frequencies) and auditory memory [1]. Also, Broca's area
attends to language processing, speech production, and comprehension; it is connected
with Wernicke's area of the temporal lobe via "nerve fibers bundles (i.e. Arcuate Fasciculus)" [3].
Wernicke's area processes written and spoken language [4], and it is involved in language
comprehension [1]. Besides sound processing, the Temporal lobe does speech processing
and includes the Middle Temporal Gyrus, which does word retrieval and language processing [1].
The Parietal lobes are involved in placing the lips and tongue in the proper position for
speech [5], and the primary motor cortex initiates the lip and tongue movements [1].
2. Visual processes: the occipital lobe "receive[s] and interpret[s] visual information" and is
involved in functions such as reading and reading comprehension [6]. The Occipital Lobe is the
primary visual area, where projections are received from the retina via the thalamus and
through neurons that encode the information (e.g., color, orientation, and motion) [1].
The visual association area is in Brodmann areas 18 and 19, which interpret visual stimuli
[2]. Further, the pathways go from the occipital lobes to the temporal and parietal lobes. The Parietal
lobe works with the visual cortex; it integrates information from the dorsal visual pathways,
which process 'where things are', and the ventral visual pathways, which project to the
substructures of the temporal lobes to identify 'what things are' [1]. The substructures of the
temporal lobes participate in visual perception (i.e., the Inferior Temporal Gyrus) and
recognition, e.g., recognizing and interpreting information about faces [1]. Furthermore,
the "temporal lobes help to connect the visual information received with memories" [6].
3. Sensation processes: next to the primary somatosensory receiving area, around its
borders, there are somatosensory association functions in Brodmann areas 5 and 7,
which interpret somatosensory stimuli [2]. The Parietal lobes integrate information about
different senses; their main functions receive and process sensory information. For
example, the somatosensory cortex in the parietal lobe participates in identifying the location
of a touch sensation [5]. The Parietal lobes have an important role in somatosensory
information perception and integration [1], as the parietal lobe "contains many distinct reference
maps of body, near and distant space, and which are constantly updated" [1]. The
limbic system is involved in sensory perception and sensory interpretation [7]: the
amygdala gets the sensory signals from the thalamus and uses them to process emotions;
further, the hippocampus connects emotions and senses to memories [7] and participates in
early memory storage and the formation of long-term memory [1].
4. Homeostasis processes: the autonomic functions are controlled by brain structures
such as the Hypothalamus, the Pons, and the Brainstem. The Hypothalamus "is responsive to a variety
of stimuli including light (it regulates circadian rhythms), odors (e.g. pheromones),
stress, and arousal (hypothalamic neurons release oxytocin directly into the
bloodstream)" [1]. The Hypothalamus controls functions such as hunger, thirst, body
temperature, perspiration, blood pressure, shivering, pupil dilation, circadian rhythms,
sleep, and heart rate. Also, the brainstem participates in controlling blood pressure, sleep, and
heart rate. The brainstem contains the following brain structures: the Pons,
the Medulla Oblongata, and the midbrain. It also controls functions such as perspiration,
digestion, temperature, and the regulation of breathing. Breathing, and the circuits
that generate respiratory rhythms, are mostly associated with the Pons [1].
5. Emotion and behavior processes: the recent research of Kitamura et al. [8] explains that
the Engram cells of the Basolateral amygdala store both positive and negative
emotional events and are important for the communication of emotions linked to a
memory. Further, Bergland [9] explains that "the amygdala acts as a type of emotional
relay between the hippocampus and prefrontal cortex". The Hypothalamus participates in
regulating emotional responses "elicited by sensory information through the release of
hormones that act on the pituitary gland in response to stress" [7]. According to the recent research of
Manninen et al. [10], "Social laughter increased pleasurable sensations and triggered
endogenous opioid release in Thalamus, Caudate nucleus, and Anterior Insula".
Further, the relationship between opioid receptor density and the rate of laughter may
indicate that the opioid system underlies our differences in sociability. Our emotions
are recognizable in our comportment; they are manifest in the behavioral patterns of
facial expressions and in autonomic arousal [11]. Furthermore, Dolan [11] explains the
global effects of emotion on all aspects of cognition, such as the emotional influence on perception and
attention, and on subjective feeling states. Emotional experience in the brain involves
the Insular cortex, the Orbitofrontal cortex, and the anterior and Posterior Cingulate Cortices
[11]. Baley [12] explains that the prefrontal cortex in the frontal lobes "is responsible for
personality expression" and that the frontal lobes contribute to the functions that
form our individual personalities. The prefrontal cortex is associated with voluntary
behavior, e.g., decision-making, planning, problem-solving, and thinking, and with
personality and emotion, by evaluating and controlling appropriate social behavior and
inhibition [1]. Further, social skills and personality traits are associated with the frontal
lobes and with substructures of the limbic system: the Cingulate Gyrus is associated with
regulating emotions, and the amygdala participates in fear processing, emotion processing,
and the fight-or-flight response [1]. The substructures of the temporal lobes participate in
emotional reactions, and the forebrain structure, the basal ganglia, contributes to emotional
behaviors, reward and reinforcement, habit formation, and addictive behaviors [1].
The posture function is associated with the basal ganglia, the Pons, and the cerebellum [1]. Further,
the cerebellum participates in the automatic movements of motor behavior [1].
6. Motor processes: the primary motor cortex is associated with the coordination and initiation
of motor movements [1]. The primary motor cortex area is divided into specific body
parts, and each body part's cell density and surface area differ; for example, the "arm
hand motor area occupies the most space in the motor cortex (unsurprising given their
importance to human behavior)" [1]. The primary motor cortex is a substructure of the
motor cortex; in addition, the motor cortex contains the Premotor cortex and the supplementary
motor area [13]. The Premotor cortex participates in the preparation and
execution of limb movements. To choose the right movement, it uses information from the other
cortical regions. Also, it participates in learning (in the form of imitation) and in social cognition
functions (in the form of empathy) [1]. The supplementary motor area participates in "selecting movements
based on the remembered sequences of movements", the mental rehearsal of bilateral
movements, and the transformation of kinematic information to dynamic [13]. The thalamus
participates in relaying motor and sensory information between the cortex, the
cortical structures, and the brain stem. As we also express ourselves in body language,
emotions, for example, appear in our countenance and gestures, both involuntarily and
voluntarily. "Voluntary movements require the participation of the motor cortex and
association cortex"; although the association cortex does not belong to the motor areas, it
is still necessary for the adaptation of the movement to make it appropriate to the
behavioral context [13].
7. Memory processes: according to Poo et al. [14], "synapses are the basic units of information
storage". The amygdala is a substructure of the limbic system, and it actuates "what
memories are stored and where the memories are stored in the brain" [7]. Dolan [11]
describes that the amygdala participates in episodic memory encoding and in the retrieval of
emotional context and items, and that it has a critical role in fear conditioning, which is a
form of emotional memory (i.e., implicit memory). Further, the amygdala attends to the
learning of conditioned and unconditioned associations [11], for example associative
learning such as reward and appetitive learning. Along with the other limbic system
substructures (the Cingulate Gyrus, the Dentate Gyrus, and the Entorhinal cortex), the amygdala
participates in memory formation [1]. Moreover, the Entorhinal cortex has an
important role in memory formation: it preprocesses the memorable information
for the hippocampus and provides the main input area, lateral and medial, to the
hippocampus [1]. This information stream (i.e., the Perforant path) goes from the Entorhinal
cortex through the Dentate Gyrus to the hippocampus [1]. The Dentate Gyrus is one of the
rare regions where the development of new neurons has been confirmed [1]. The
Dentate Gyrus "may play an important role in translating complex neural codes from
cortical areas into simpler code that can be used by the hippocampus to form new
memories" [1]. The hippocampus is the brain structure most closely tied to memory
formation. It is an early storage place for long-term memory, and it is involved "in the
transition of long-term memory to enduring memory" [1]. It also has an important role
in spatial navigation [1]. The latest research result of Kitamura et al. [8] also indicates
that the hippocampus is an early storage place for long-term memory and that memories are
initially established at the same time in the hippocampus and in the Engram cells of the
prefrontal cortex (i.e., specialized neurons, which consolidate long-term memories as time
passes). According to the research, the prefrontal cortex, the hippocampus, and the Basolateral
amygdala participate simultaneously in early memory formation until the memory is
consolidated in the prefrontal cortex. Bergland [9] further explains that the technological
breakthrough of the Tonegawa lab "allowed them to label specific Engram cells in various
parts of the brain that contained specific memories", and that this allowed the scientists
to trace the brain circuits participating in memory formation, storage, and retrieval.
The Subiculum is associated with functions such as memory processing; it is the output area of
the hippocampus (in the temporal lobes) and is important to learning and memory [1]. The
Superior Temporal Gyrus is a substructure of the temporal lobe, and it is associated with
auditory memory functions [1]. Memory acquisition is a function associated with the
temporal lobes [1]. The Perirhinal cortex has "an important role in storing information
(memories) about the objects" [1]. Bechara et al. [15] locate working memory in the
frontal lobe. The cerebellum participates in the reflex memory function [1].
8. Cognitive processes and cognition, “all cognitive functions result from the integration
of many simple processing mechanisms, distributed throughout the brain” [1]. The
following brain structures participate in cognition: Baley [12] explains that prefrontal
cortex in frontal lobes “is responsible for personality expression and the planning of
complex cognitive behaviors” such as reasoning, problem solving. Further she
explains that the frontal lobes contribute in the functions such as judgement, and it
“help us set and maintain goals, curb negative impulses and form our individual
personalities” [12]. The cerebellum represents about 80 percent of the total neurons of
the human brain and most of them are granule cells [16], still it has been considered
occupying unconscious activities until recently. Bergland [16] presents the novel
neuroscience studies of cerebellum [17–21], which indicate that cerebellum and the
other subcortical brain structures such as basal ganglia participate in various cognitive
processes. The cerebellum in the ‘3D Brain’ [1] is associated with cognitive functions such
as sequence learning, motor learning and attention. “Pathways from occipital
lobes to temporal and parietal lobes are eventually processed consciously” [1]. The
parietal lobe [5] processes attentional awareness of the environment [1]. Its substructure
functions are associated with the perception and integration of somatosensory
information, and it participates in manipulating objects and number representation [1].
The Middle and Inferior Temporal Gyri participate in many cognitive processes such as
semantic memory processing, language processes and visual perception [1]. The Perirhinal
cortex has an “important role in object recognition” and many connections to other
brain structures. These connections “allow it to specialize in associating objects with
sensory information and potential consequences” [1]. The substructures of the temporal
lobes participate in perception, face and object recognition, emotional reactions, language
understanding, learning and memory functions. For example, the Superior Temporal
Gyrus contains Wernicke’s area, which “is the major area involved in the comprehension of
the language” [1]. Substructures of the limbic system, such as the Cingulate Gyrus, are
associated with regulating emotions and processing smells; the amygdala participates in
fear processing, emotion processing, learning, the fight or flight response and reward
processing. The basal ganglia also contribute to emotional behaviors, reward and
reinforcement, habit formation, and addictive behaviors [1]. The amygdala contributes to
linking perception with memory and automatic emotional responses [11]. The thalamus
participates in relaying information between the brain stem and the cortex, as well as among
other cortical structures [1]. This role in cortico-cortical interactions involves many brain
processes, such as perception, attention, timing, alertness and consciousness, and it
contributes to perception and cognition [1]. The Frontal Lobe “is the main site of so-called
‘higher’ cognitive functions” [1]. The substructures of the frontal lobe contribute to
thought, decision-making, planning, problem solving (i.e., voluntary behavior), cognition,
intelligence, attention, language processing and comprehension, etc. [1]. In particular, the
prefrontal cortex contributes to higher brain functions. It participates in executive
functions such as judgement, reasoning and planning, and it also participates in personality
and emotion by evaluating and controlling appropriate social behavior [1]. Also, the cerebral
cortex’s somatosensory, visual and auditory association areas participate in higher
cognitive and emotional processes such as memory, learning, and the interpretation of
sensations and speech; they are connected to each other and to the neothalamus, unlike
the primary areas [2]. Further, three unimodal association areas (the limbic, posterior and
anterior association areas) participate in association and executive processing. They “are
adjacent to their respective primary sensory cortical areas”, and the farther association
functions lie from the primary sensory area, the more general the functions are [22]. The
limbic association area “links emotion with many sensory input and is important in
learning and memory”, the posterior association area “link information from primary
and unimodal sensory areas and it is important in perception and language”, and the
anterior association area “links information from other association areas and is
important in memory, planning and higher-order concept formation” [22]. In addition to
the foregoing, Wang et al. [23] list the following among the higher cognitive processes:
recognition, imagery, learning, deduction, induction, explanation, analysis, synthesis,
creation, analogy and quantification. Working memory is closely associated with
cognitive functions; e.g., a recent study in the journal Frontiers in Aging Neuroscience
presents that challenging cerebral tasks can improve cognitive functions related to
working memory, such as complex reasoning, processing speed and abstract thinking
[24,25]. Bechara et al. [15] consider working memory to be part of the cognitive functions
of the frontal lobe.

4. Framework to categorize the cognitive functions


According to Wright [22], brain functions which are localized to a specific region of the
brain have considerable clinical importance, since the localization of a function can explain
“why certain syndromes are characteristic of disease in specific brain regions”. However,
“no part of the brain works in isolation. Each and every part of the brain works in concert with
every other part. When a part of the brain is removed, the resulting behavior may reflect more
about the adjusted capacities of the remaining “parts” than the removed part” [22].
Accordingly, our interest in this paper concerns the brain functions, and based on the
Section 3 descriptions we tabulated the brain functions into nine groups: Language functions,
Auditory functions, Visual functions, Motor functions, Homeostasis functions, Emotion and
Behavior functions, Sensation functions, Memory functions, and Cognitive functions. These
categories are obtained from the intentional functions of the brain and from the paths through
the transition of the sensory stimuli into the cognitive processes and further to the outcome.
These categorical groups are composed bottom-up, i.e., from the basic cognitive brain
function to the higher-level cognitive functions. As described in Section 3, “all cognitive functions
result from the integration of many simple processing mechanisms” [1]; likewise, the three unimodal
association areas (i.e., the limbic, posterior and anterior association areas) participate in
association and executive processing, they “are adjacent to their respective primary
sensory cortical areas”, and the farther association functions lie from the primary sensory
area, the more general the functions are [22]. Hence the functionality can be represented
as a slice of the categories (i.e., from top to bottom (from outcome to input) or from bottom to
top (from input to outcome), as desired). A slice of the functional categories could be further
divided by cognitive degree (e.g., cognitive functions, higher cognitive functions, executive
cognitive functions, complex cognitive functions). Further, the nature of the cognitive functions
(i.e., the result of the integration of many processing mechanisms) means that a common function
involved in multiple processes is listed under each participating process. For example, facial
expressions, empathy and imitation functions are listed in both the motor and the emotion and
behavior processes. The cognitive groups of functions are presented in Figure 2.
All processes and functions represented herein are described in Section 3. The processes
represented with the dashed line describe unlimited interactions between the processes. The result
of a cognitive function is the result of multi-functional cooperation; for example, for the functionality
of language processes it is important to cooperate with other processes, such as visual information,
auditory information and motor processes. Figure 3 is an example of the functions of language
processes, where, e.g., we cannot speak well without the movements of the lips and tongue.
The cognitive groups of functions form a functional interactive hierarchy (Figure 4). The
hierarchy structure of cognitive functions underpins the functions of the structures of the brain.
These categories are obtained from the intentional functions of the brain and from the paths
through the transition of the sensory stimuli into the cognitive processes. The groups of the brain
functions are formed according to the process into nine main functional hierarchies: visual
functions, auditory functions, motor functions, sensation functions, homeostasis functions,
language functions, emotion and behavior functions, memory functions, and cognitive functions.

5. The comparison of the cognitive functions


We have chosen cognitive service examples for the comparison of human and cognitive
computing functions; the chosen cognitive services have online demonstrations whose
cognitive functionality can be evaluated by experimenting. The McKinsey Global Institute
analysis (i.e., Exhibit 4, p. 37) has, for example, found the social, cognitive and physical
patterns of capabilities, which are often required together to support many activities; hence
our chosen functionality examples of cognitive services and NDC applications are examples
of these capabilities [26]. The examples are the following: IBM Visual Recognition [27],
MS Speaker Recognition [28], and IBM Tone Analyzer [29]. The cognitive services will be
described alongside the comparisons.
First, we compare the human and computing cognitive functions. In Figure 5 we represent
an example of visual recognition from the perspective of the human cognitive functions (upper
part of the figure) and, below it, the cognitive computing functions of the IBM Visual Recognition
service (lower part of the figure). Further, we describe the main similarities on the basis of the
underlying material.
Many human cognitive processes integrate and interpret information. As described in
Section 3, the human cognitive functions are interactive and simultaneous.

Figure 2. Example of cognitive groups of functions.
Figure 3. Example of Language processes functions.
Figure 4. Cognitive functions functional hierarchy.

The required
human cognitive functions largely depend on the content of the image: for example, if the
image depicts the IKEA furniture assembly guide, then the number of required cognitive
functions is normally greater than if the image depicts a football. In this example (Figure 5),
the selected cognitive functions cover multiple images. For example, a human can detect
feeling states and actions, such as facial expressions, from an image; an image from a
psychological test requires imagination to recognize; and images which contain text require
language processing functions, such as reading comprehension, to detect the message.
The IBM Visual Recognition service, in contrast, analyzes the images for faces and objects
with deep learning algorithms [30]. The deep learning algorithms are a type of backward-
chaining neural nets that “uses learning algorithms to progressively infer a pattern about a
body of data by starting with the goal and then determining the rules that are inferred, which
can then be used to reach other goals” [31]. The service can identify food and colors,
categorize and tag the image, and detect the face, age and gender, and for celebrities it returns
the identity, knowledge graph and name. Further, the service can be trained to create custom
classifiers [32]. The results are represented with a score (i.e., scores range from 0 to 1; a higher
score equals a greater correlation). For example, the image of IKEA’s assembling guide [33]
gives as an outcome (Figure 6, left side) a ‘study’ class with a score of 1.0 and the colors gray
(0.87) and olive green (0.86).
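The score-based outcome above can be post-processed with a simple threshold filter. The sketch below is a minimal illustration: the class names and scores mirror the assembling-guide example in the text, but the threshold value and the `top_classes` helper are our own assumptions, not part of the IBM service.

```python
# Minimal sketch: filtering and ranking class scores returned by an
# image-classification service. The example scores mirror the IKEA
# assembling-guide outcome discussed in the text; the 0.5 threshold
# is an assumption for illustration, not an IBM default.

def top_classes(results, threshold=0.5):
    """Keep classes whose score meets the threshold, best first."""
    kept = [r for r in results if r["score"] >= threshold]
    return sorted(kept, key=lambda r: r["score"], reverse=True)

results = [
    {"class": "study", "score": 1.0},
    {"class": "gray color", "score": 0.87},
    {"class": "olive green color", "score": 0.86},
    {"class": "paper", "score": 0.42},
]

print(top_classes(results))  # 'study' ranks first; 'paper' is dropped
```

Raising the threshold narrows the outcome to the most reliable classes, which is how a consumer of such scores typically trades recall for precision.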
Figure 5.
The examples of the
cognitive functions of
human visual
recognition and the
IBM Visual
Recognition service.

Figure 6.
The examples of IBM
Visual Recognition
service outcome of the
assembling guide
image and the image of
the product, and an
image of the face.
We also tested the service with an image of the same product (Figure 6, middle); the
image of the product is classified into the loudspeaker, electrical device and device classes with a
score of 0.69. The second-best score (0.60) is in the ‘portfolio and file folder’ classes.
Furthermore, the service returns type hierarchies: /electrical device/loudspeaker,
/recorder/black box, /electrical device/loudspeaker/subwoofer. The example face image
(Figure 6, right side, [34]) is classified as a person (0.73) and President of the United States
(0.55), with a reddish orange color (0.67). Further, the service identifies a male (1.00) face aged
between 55 and 64 (0.56). To improve the results, the service needs a customized classifier and
training (positive and negative images of the subject).
We realized that the extent of the functionality of visual recognition in humans and in
cognitive computing is not the same; therefore, we chose to compare cognitive computing with
human cognitive functions. In the same manner (i.e., comparing computing cognitive functions
against human functions), the comparison of the cognitive functions of cognitive services is
presented in Section 5.1 and the comparison of the cognitive functions of NDC applications in
Section 5.2.

5.1 The cognitive functions in cognitive services versus human functions


In the following, we illustrate the main functionality of each cognitive service alongside the
similar human functions, as follows: IBM Visual Recognition, MS Speaker Recognition,
and IBM Tone Analyzer. The functional similarities are presented in order: first the
computing functionality and then, in brackets [ ], the similar human cognitive functions.
5.1.1 Visual recognition. The following describes the similarities between IBM Visual
Recognition and human cognitive functions (Figure 7). The service receives the image either
through a link or through the drop-image function, and processes it [receive sensory information;
processing sensory information]. The neural network uses the inferred rules to categorize the
image (i.e., the “learning algorithms to progressively infer a pattern about a body of data by
starting with the goal and then determining the rules that are inferred, which can then be used to
reach other goals” [30]).
The rules of the service have been inferred during the training phase (i.e., when it has
been trained to recognize a certain type of images). In the training phase, the neural net uses the
training sets of images, which it processes to infer the patterns and to form the rules [visual
memory, memory retrieval, working memory, memory processing, memory acquisition
(i.e., the acquisition of the rules), learning]. The acting service categorizes the inputted image

Figure 7.
The example of
similarities of the IBM
visual recognition
service example and
human functions.
and presents the type hierarchy (i.e., knowledge graph) [information processing, semantic
memory processing and the higher-order concept formation, visual perception], and for
classification (i.e., in order to be accepted into a class) the inputted image needs to comply with
the rules [the interpretation of visual information, integrate information (where and what)].
The correlation scores are calculated [i.e., quantification], the results are found with respect to
the threshold, and the image is categorized. The service analyzes and returns the
color scheme [encoding of the color]. The items in the image are identified based on
the categories, and the image tag with the score is returned [recognition; object and face recognition;
number representation]. The service marks the recognized face with a square [spatial mapping]. In
the functional hierarchy, the similarities are mostly in visual functions; however, the
functionality of the service also needs sensation, memory and upper cognitive functions.
5.1.2 Speaker recognition. The MS Speaker Recognition service performs the verification and
identification of a speaker, through its ‘verify’ and ‘identify’ functions [35]. Speaker verification
builds on the unique characteristics of the voice (i.e., a unique voice signature), which is used to
identify a person. ‘Speaker identification’, in turn, compares the audio input (i.e., the voice) with
the provided group of speakers’ voices, and if a match of the voice is found, then the speaker’s
identity is returned. Finding the match of the voice requires a pre-registration of the voice
signature of the speaker.
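The identification step described above, matching extracted voice features against enrolled speaker models, can be sketched as a nearest-neighbor search over feature vectors. The cosine-similarity matching below is a generic illustration of the idea, not Microsoft's actual algorithm; the feature values, speaker names and the threshold are invented for the example.

```python
import math

# Generic sketch of speaker identification: compare an input feature
# vector against enrolled voice signatures and return the best match.
# Cosine similarity stands in for the service's real matching logic.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def identify(features, enrolled, threshold=0.9):
    """Return the enrolled speaker most similar to the input, or None."""
    best, best_sim = None, threshold
    for name, signature in enrolled.items():
        sim = cosine(features, signature)
        if sim > best_sim:
            best, best_sim = name, sim
    return best

enrolled = {"alice": [0.9, 0.1, 0.3], "bob": [0.2, 0.8, 0.5]}
print(identify([0.88, 0.12, 0.31], enrolled))  # closest to "alice"
```

The enrollment phase in the text corresponds to populating the `enrolled` dictionary; verification against a single claimed identity would compare one pair instead of searching over all signatures.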
The service (Figure 8) receives the audio input through a microphone [receive sensory
information]. The process is divided into two parts: speaker verification and speaker
identification. The creation of the unique voice signature (i.e., enrollment) precedes
successful speaker recognition. In the voice signature enrollment part, the system extracts the
predefined features from the audio example and creates the model [35,36] [Processing sensory
information, Sensory interpretation, Interpretation of the audio stimuli, Auditory information
processing, Sound processing, Speech processing, Language processing, Tonotopic map
functions, Processing combinations of frequencies, Processing changes in amplitude or
frequencies, Personality expression, Analysis, Synthesis, Creation, the learning of unconditioned
associations, Association, Learning, Auditory map, Auditory memory, Working memory,
Preprocessing the memorable information, Memory processing, Memory functions, Memory
acquisition, Memorizing, Connecting senses to memories, Choosing the memories to be stored,

Figure 8.
The example of
similarities of the MS
Speaker recognition
service example and
human functions.
Choosing the place where memories are stored, Memory formation, New memory formation,
Memory storage].
In the verification part, the Speaker Recognition service extracts the features from the audio
input [Receive sensory information, Processing sensory information, Sensory interpretation,
Interpretation of the audio stimuli, Auditory information processing, Sound processing,
Tonotopic map functions, Processing combinations of frequencies, Processing changes in
amplitude or frequencies, Speech processing, Language processing, Personality expression,
Analysis, Memory processing, Memory retrieval, Working memory, Auditory memory]. In
the speaker identification part, the extracted features are matched against the model [Analogy,
Decision making, Recognition]. The similarities in the functional hierarchy and the principal
basic functions are in the auditory information, language, and emotion and behavior
functions hierarchies.
5.1.3 Tone analyzer. The IBM Tone Analyzer service uses the unsupervised learning algorithm
GloVe to detect social tones from textual input. The Tone Analyzer extracts the tones of the
text as emotions such as anger, disgust, fear, joy and sadness; language styles such as analytical,
confident and tentative; and social tendencies such as openness, conscientiousness, extraversion,
agreeableness and emotional range [37].
The IBM Tone Analyzer service (Figure 9) detects tones from textual input. It uses the
unsupervised learning algorithm GloVe to detect social tones (i.e., it tokenizes the text,
obtains the vector representation of the words of the text and further processes it in
order to generate the meanings) [38]; in more detail, see [39]. [Receive sensory information,
Processing sensory information, Word retrieval, Written language processing, Language
processes, Language processing, Information processing, Reading functions, Semantic
memory processing, Learning of conditioned associations, Learning, the comprehension of
complex syntax, the comprehension of reading functions, Language understanding,
Language memory processing, Executive processing, Association, Recognition, Abstract
thinking, Deduction, Analysis, Analogy, Quantification, Cognition, Memory, Memory
functions, Memory retrieval, Memory processing, Working memory, Choosing the
memories to be stored, Choosing the place where memories are stored, Preprocessing the
memorable information, Memory formation, Memory storage, Memory acquisition,
Memorizing, New memory formation, Number representation].
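The word-vector processing mentioned above can be illustrated with toy vectors: in GloVe-style embeddings, related words lie close together, and analogies can be answered by vector arithmetic. The two-dimensional vectors below are invented for illustration; real GloVe vectors are learned from co-occurrence statistics and have tens to hundreds of dimensions.

```python
import math

# Toy illustration of word-vector analogy ("king - man + woman ~ queen")
# with invented 2-D vectors; real GloVe embeddings are learned and
# far higher-dimensional.

vectors = {
    "king":  [0.9, 0.8],
    "queen": [0.9, 0.2],
    "man":   [0.1, 0.8],
    "woman": [0.1, 0.2],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def analogy(a, b, c):
    """Solve a - b + c ~ ?, excluding the query words themselves."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = {w: v for w, v in vectors.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))

print(analogy("king", "man", "woman"))  # → queen
```

This vector-arithmetic scoring is the mechanism behind the analogy comparison with human judgments discussed next.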
The GloVe model can be used to output the analogy of words, and its scoring can
further be compared with human judgments [38]. The Tone Analyzer processes the
results of GloVe in machine learning algorithms, from which it extracts the Big Five
(i.e., a personality model whose higher-level dimensions are openness, conscientiousness,
extraversion, agreeableness and emotional range), Needs and Values characteristics. The service
uses a model which is trained against ground-truth data (i.e., corpora) [Learning]. The Tone
Analyzer extracts the tones of the text as follows [37]:
– Social tendencies are openness, conscientiousness, extraversion, agreeableness and emotional
range. The service describes the social tendencies in context; see Appendix B. [Reading
functions, Information processing, Language processes, Written language processing,
Language processing, Language memory processing, Semantic memory processing, the
Learning of conditioned associations, the Comprehension of complex syntax, Analysis,
Recognition, Interpretation of sensations, the Comprehension of reading functions,
Language understanding, Executive processing, Sociability, Inhibition, Empathy,
Emotion processing, Emotional memory, Feeling states, Posture, Personality,
Personality expression, Personality traits, Emotion, Association, Higher-order concept
formation, Abstract thinking, Deduction, Analogy, Reasoning, Synthesis, Quantification,
Number representation, Connecting emotions and senses to memories, Cognition,
Memory, Memory functions,
Figure 9.
The example of
similarities of the IBM
Tone Analyzer service
example and human
functions.
Memory retrieval, Memory processing, Choosing the memories to be stored, Choosing the
place where memories are stored, Preprocessing the memorable information, Memory
formation, Memory storage, Memory acquisition, Memorizing, New memory formation.]
– The emotions are anger, disgust, fear, joy and sadness. Emotional tones are revealed
through a stacked generalization-based ensemble framework, where a higher-level
model combines lower-level models to gain better predictive accuracy. The machine
learning algorithm classifies the input features with sentiment polarity into emotion
categories. The service describes the emotions in context; see Appendix B. [Receive
sensory information, Working memory, Processing sensory information, Sensory
interpretation, Word retrieval, Reading functions, Information processing, Written
language processing, Language processing, Language memory processing, Semantic
memory processing, Analysis, Recognition, Interpretation of sensations,
Comprehension of reading functions, Emotion, Emotion processing, Emotional
memory, Feeling states, Learning of conditioned associations, Association,
Higher-order concept formation, Abstract thinking, Deduction, Analogy, Reasoning,
Synthesis, Quantification, Number representation, Connecting emotions and senses to
memories, Choosing the memories to be stored, Choosing the place where memories are
stored, Preprocessing the memorable information, Memory formation, Memory storage,
Memory retrieval, Memory processing, Memory acquisition, Memorizing, New memory
formation]
– Language styles are analytical, confident and tentative. Language tones are labels formed
on the basis of ground-truth data (i.e., human-labeled sentences: positive and negative
sentence samples and a few sentences which are not understandable). In practice, machine
learning classifies the sentences and documents with the help of the ground-truth labels.
The service describes the language styles in context; see Appendix B. [Receive sensory
information, Working memory, Processing sensory information, Sensory information
interpretation, Word retrieval, Reading functions, Information processing, Language
processes, Written language processing, Language processing, Language memory
processing, Semantic memory processing, the Learning of conditioned associations, the
Comprehension of complex syntax, Analysis, Recognition, Interpretation of sensations,
the Comprehension of reading functions, Language understanding, Executive
processing, Inhibition, Personality expression, Posture, Association, Higher-order
concept formation, Abstract thinking, Deduction, Analogy, Reasoning, Synthesis,
Quantification, Number representation, Cognition, Memory, Memory functions,
Memory retrieval, Memory processing, Choosing the memories to be stored, Choosing
the place where memories are stored, Preprocessing the memorable information,
Memory formation, Memory storage, Memory acquisition, Memorizing, New memory
formation].
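The stacked generalization mentioned for the emotion tones can be sketched as follows: lower-level models each produce a score, and a higher-level (meta) model combines those scores into the final prediction. The base models, the lexicon, and the weights below are invented toy rules purely for illustration, not IBM's actual ensemble; real stacking also trains the meta-model on held-out data rather than fixing its weights.

```python
# Toy sketch of stacked generalization: two invented base "models"
# score a sentence for an emotion, and a meta-model combines their
# outputs. Real stacking trains the meta-model on held-out data.

def lexicon_model(text):
    """Base model 1: crude keyword lookup (invented lexicon)."""
    angry_words = {"furious", "hate", "outraged"}
    words = [w.strip("!?.,").lower() for w in text.split()]
    hits = sum(w in angry_words for w in words)
    return min(1.0, hits / 2)

def punctuation_model(text):
    """Base model 2: exclamation marks as a crude arousal proxy."""
    return min(1.0, text.count("!") / 3)

def meta_model(scores, weights=(0.7, 0.3)):
    """Higher-level model: weighted combination of base outputs."""
    return sum(w * s for w, s in zip(weights, scores))

def anger_score(text):
    return meta_model((lexicon_model(text), punctuation_model(text)))

print(anger_score("The report is ready."))        # low
print(anger_score("I am furious and outraged!!!"))  # high
```

The point of the higher-level model is that it can learn when each base model is trustworthy, which generally beats any single base model on its own.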
The service returns each tone with a score. In the emotion and language style analyses, a
score of 0.5 or less indicates that the tone is ‘unlikely to be detected’ in the content, while a
score greater than 0.75 indicates a ‘higher likelihood’ [Number representation, Quantification].
The social tendencies describe the content on a bipolar scale, where a score of less than 0.5 is
more likely to be perceived as the lower pole and a score higher than 0.75 as the upper pole;
the agreeableness tone, for example, is perceived as selfish below 0.5 and as caring above 0.75
(in more detail, see [40]). The similarities in the functional hierarchy and the principal basic
functionalities are in the visual information, language, and emotion and behavior functions
hierarchies.
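The score interpretation above maps directly onto a small decision rule. The sketch below encodes the thresholds quoted in the text; the label for the undocumented middle range (between 0.5 and 0.75) is our own shorthand.

```python
# Thresholds from the text: <= 0.5 means the tone is "unlikely to be
# detected", > 0.75 means a "higher likelihood". The label for the
# intermediate range is our own shorthand.

def interpret_tone(score):
    if not 0.0 <= score <= 1.0:
        raise ValueError("tone scores range from 0 to 1")
    if score <= 0.5:
        return "unlikely to be detected"
    if score > 0.75:
        return "higher likelihood"
    return "possibly present"

for tone, score in [("joy", 0.82), ("anger", 0.31), ("tentative", 0.6)]:
    print(tone, "->", interpret_tone(score))
```

For the social tendencies, the same thresholds would instead select between the two poles of the dimension (e.g., selfish versus caring for agreeableness).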

5.2 The NDC application versus human functions


The NDC application is an application that is not defined as cognitive; it is a
programmable system that has rules and predetermined processes for producing results
(i.e., the outcome of a non-cognitive platform) [41]. The focus of this study is to describe what
types of computing functions could be considered cognitive; therefore, we first compare
the NDC applications with human cognitive functions and thereafter analyze the
differences and similarities between the NDC and cognitive applications (Section 7).
We have chosen NDC applications which have online descriptions available. The chosen
applications are the Apple Watch Activity app [42] and the Apple iPhone Parked Car service [43].
The functional similarities are explained in the following order: first the computing functionality
and then the similar human cognitive functions, listed in brackets [ ].
5.2.1 Activity app. The Apple Activity app on the Apple Watch (Figure 10) monitors and helps the
user to reach preset movement goals. The application tracks the user’s movement in three ring
categories: the Move, Exercise and Stand rings. The Move ring measures calorie consumption,
the Exercise ring measures energetic activity time (the default goal is a 30-min exercise per day),
and the Stand ring measures standing time (standing up and moving, at least 1 min within 12 h of
a day) [42] [Receive sensory information, Processing sensory information, Sensory interpretation,
Information processing, Preprocessing the memorable information, Working memory].
The service gives reminders [Attention, Timing], progress updates and goal completions
[Memory processing, Analysis, Analogy, Quantification], as well as achievements [Maintain goals,
Synthesis]. The reminders, progress updates and completion information can be set on/off.
The activity results of each day are saved in the application [Memorizing, New memory
formation, Memory storage], and the user can browse the results [memory retrieval]. The
similarities in the functional hierarchy and the principal functionalities are in the cognitive,
memory, and sensation functions hierarchies.
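The ring logic described above reduces to comparing daily totals against goals. The sketch below is a generic illustration, not Apple's implementation: the goal values are invented except for the 30-min Exercise default mentioned in the text.

```python
# Generic sketch of activity-ring progress tracking. Goal values are
# invented, except the 30-min Exercise default mentioned in the text.

GOALS = {"move_kcal": 500, "exercise_min": 30, "stand_hours": 12}

def ring_progress(totals, goals=GOALS):
    """Return each ring's completion as a fraction, capped at 1.0."""
    return {ring: min(1.0, totals.get(ring, 0) / goal)
            for ring, goal in goals.items()}

def completed_rings(totals):
    """List the rings whose goal has been reached."""
    return [ring for ring, frac in ring_progress(totals).items()
            if frac >= 1.0]

today = {"move_kcal": 620, "exercise_min": 24, "stand_hours": 12}
print(ring_progress(today))
print(completed_rings(today))  # Move and Stand closed, Exercise not
```

Reminders and goal-completion notifications would then be simple triggers on these fractions, e.g. firing when a ring first reaches 1.0.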
5.2.2 Parked Car. The Apple iPhone Parked Car service works with Maps. In the example
(Figure 11) of the Parked Car service, only the car parking information part is presented, even
though the service also performs other functions, such as making reservation requests for Uber.
The car parking function searches the current location of the car (i.e., latitude and
longitude) when the iPhone disconnects from CarPlay or Bluetooth; it shows a message to the
Figure 10.
The example of the
similarities of the
Activity app example
and human functions.
Figure 11.
The example of the
similarities of the Car
Parked (parking
functionality) and
human functions.
user and saves the location in a database. Further, it shows the location to the user in the
Maps application. When needed, the Maps application gives directions which guide the user
to the car [43]. [Receive sensory information, Working memory, Processing sensory
information, Sensory interpretation, Integrate information (where the things are), Integrate
information (what the things are), Preprocessing the memorable information, Information
processing, Association, Higher-order concept formation, Memory formation, Memory
storage, Choosing the memories to be stored, Choosing the place where memories are stored,
Spatial mapping, Memorizing, New memory formation, Memory, Memory processing,
Memory functions, Memory retrieval, Reference map functions of near and distant space,
Executive processing, Spatial navigation.] The similarities in the functional hierarchy and the
principal functionalities are in the memory, cognitive, sensation, and visual information
functions hierarchies.
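The parking workflow above (save the location on disconnect, retrieve it on demand) can be sketched as a small state store. The event names, the coordinates and the in-memory storage below are assumptions for illustration, not Apple's API.

```python
# Minimal sketch of the Parked Car flow: store the car's coordinates
# when the phone disconnects from the car, and retrieve them later.
# Event names and the in-memory "database" are illustrative only.

class ParkedCarService:
    def __init__(self):
        self._db = {}  # stands in for persistent storage

    def on_disconnect(self, user, latitude, longitude):
        """Triggered when CarPlay/Bluetooth drops: save the location."""
        self._db[user] = (latitude, longitude)
        return f"Car parked near ({latitude}, {longitude})"

    def find_car(self, user):
        """Return the saved location, or None if nothing was recorded."""
        return self._db.get(user)

svc = ParkedCarService()
svc.on_disconnect("ulla", 62.8924, 27.6770)
print(svc.find_car("ulla"))
```

The directions step would then feed the stored coordinate pair into a routing service; only the save-and-retrieve core is sketched here.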

6. Results and discussion


In the following, we analyze the differences and similarities between the NDC application and
cognitive service results. The functionalities of the applications are not directly comparable with
each other, because they perform different tasks. Still, we can compare their cognitive
functionalities on a scale of human cognitive functions. Table 1 summarizes the comparison of
cognitive functions as follows:
The IBM Visual Recognition service uses 13 Cognitive functions in the cognitive functions
hierarchy, 5 Memory functions, and 9 Visual and Sensation functions, i.e., hierarchy levels IV, III
and I. MS Speaker Recognition uses 16 Cognitive functions, 5 Memory functions, 3 Language
and Emotion and Behavior functions, and 11 Auditory information and Sensation functions,
i.e., there are functions from the hierarchy levels IV, III, II and I. Further, the Tone Analyzer uses
32 Cognitive functions, 16 Memory functions, 12 Language and Emotion and Behavior functions,
and 5 Visual and Sensation functions, i.e., functions from the hierarchy levels IV, III, II and
I have been used. From the NDC applications: the Apple Activity app uses 11 Cognitive functions,
7 Memory functions, and 3 Sensation functions, i.e., hierarchy levels IV, III, and I. The Parked Car
service uses 8 Cognitive functions, 13 Memory functions, and 6 Visual and Sensation functions, i.e.,
functions from hierarchy levels IV, III and I are in use. The comparison of the cognitive
functions is shown in Figure 12.
The cognitive applications use about 70% of the example of the hierarchy of cognitive
functions at levels I, III and IV, compared with the NDC applications, which use about 30%.
In contrast, at level II (i.e., Language and Emotion and Behavior functions) there are no
NDC applications.
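The shares above follow directly from the summed frequency counts in Table 1; a quick cross-check of the per-level percentages (the counts are copied from Table 1) can be computed as follows.

```python
# Cross-checking Table 1: per-level shares of cognitive vs. NDC
# application function counts. The counts are copied from Table 1.

cognitive = {"I": 25, "II": 15, "III": 36, "IV": 61}
ndc       = {"I": 9,  "II": 0,  "III": 20, "IV": 19}

def shares(level):
    """Return (cognitive %, NDC %) of the level's total, rounded."""
    total = cognitive[level] + ndc[level]
    return (round(100 * cognitive[level] / total),
            round(100 * ndc[level] / total))

for level in ("I", "II", "III", "IV"):
    cog, n = shares(level)
    print(f"level {level}: cognitive {cog}%, NDC {n}%")
```

The computed shares reproduce the roughly 70/30 split at levels I, III and IV and the 100/0 split at level II noted in the text.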
The content analysis of cognitive functions descriptions reveals the differences in
functional types. Cognitive services use machine learning algorithms for data processing that

Table 1. Comparison of cognitive functions.

Hierarchy levels                    I         II         III        IV
IBM Visual Recognition service      9   15%   0     0%   5    14%   13   16%
MS Speaker Recognition              11  18%   3    10%   15   42%   16   20%
IBM Tone Analyzer                   5    8%   12   39%   16   44%   32   41%
Cognitive applications (sum)        25  74%   15  100%   36   64%   61   76%
Activity app                        3    5%   0     0%   7    19%   11   14%
Car Parked                          6   10%   0     0%   13   36%   8    10%
NDC applications (sum)              9   26%   0     0%   20   36%   19   24%
Frequency                           34        15         56         80
Cognitive functions in each class   62  30%   31   15%   36   17%   79   38%
Cognitive functions represented (208)   16%         7%        27%        38%
Figure 12.
The comparison of
cognitive functions at
each level of hierarchy.

Cognitive services use machine learning algorithms for data processing, which reveals a
difference between the implementation techniques of these applications. Furthermore, the
cognitive services on the cognitive platforms are implemented as neural networks, which
imitate the human nervous system. We illustrate these differences in Appendix C (see
Figures 13 and 14 and Tables 3 and 4), wherein the cognitive service functions belong to the
Visual, Auditory, Motor, Sensation and Homeostasis functions (23%); the Language
functions and Emotion and behavior functions (32%); the Memory functions (21%); and the
Cognitive functions (24%). Compared with these, the cognitive functions of the NDC
applications divide into three cognitive function hierarchies: Memory functions (41%);
Visual, Auditory, Motor, Sensation and Homeostasis functions (31%); and Cognitive
functions (28%). The Language functions and Emotion and behavior functions (0%) (i.e.,
level II) of the hierarchical cognitive functions are not in use.
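The shares in Table 1 are plain frequency ratios: each cell counts the cognitive functions an application exhibits at a hierarchy level, and each percentage relates that count to the level total. A minimal sketch of the aggregation behind the table (the counts are copied from Table 1; the variable and function names are ours):

```python
# Counts of cognitive functions per hierarchy level (I-IV), from Table 1.
counts = {
    "IBM Visual Recognition": [9, 0, 5, 13],
    "MS Speaker Recognition": [11, 3, 15, 16],
    "IBM Tone Analyzer":      [5, 12, 16, 32],
    "Activity app":           [3, 0, 7, 11],
    "Car Parked":             [6, 0, 13, 8],
}
cognitive = {"IBM Visual Recognition", "MS Speaker Recognition", "IBM Tone Analyzer"}

def column_sum(apps):
    """Sum the per-level counts over a set of applications."""
    return [sum(counts[a][i] for a in apps) for i in range(4)]

cog = column_sum(cognitive)               # cognitive applications per level
ndc = column_sum(set(counts) - cognitive) # NDC applications per level
freq = [c + n for c, n in zip(cog, ndc)]  # the "Frequency" row

# Share of the cognitive applications within each level's frequency.
share = [round(100 * c / f) for c, f in zip(cog, freq)]
print(share)
```

Rounding to whole percentages reproduces the shares reported in Table 1; for example, 25/34 of the level I frequency, about 74%, comes from the cognitive applications.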
Comparison of the cognitive function descriptions reveals differences between the
functionalities of the Cognitive and the NDC applications (i.e., functionalities which are
included in the Cognitive applications but missing from the NDC applications). Examples of
the differences are as follows: Language functions (e.g., language understanding); Emotion
and behavior functions (e.g., emotion processing); Memory functions (e.g., learning); and
Cognitive functions.

Table 2. The brain structures, the associated functions of them and the descriptions (adapted from [1]).
Structure Description Substructure Associated functions

Whole Brain (Frontal Lobe, All cognitive functions result from the integration Arousal, emotion, language, learning,
Parietal Lobe, Temporal Lobe, of many simple processing mechanisms, memory, movement, perception,
Brainstem, Cerebellum. distributed throughout the brain. The outer layer sensation, thinking, many others
Occipital Lobe, etc.) of the forebrain constitutes the familiar wrinkled
tissue that is the cerebral cortex, or cortex. The
large folds in the cortex are called gyri. The small
creases within these folds are fissures. Each
hemisphere of the cortex consists of four lobes-
frontal, parietal, temporal and occipital. Other
important structures are the brainstem,
cerebellum the limbic system (which includes the
amygdala and hippocampus. Associated with
damage: it is possible for the brain to repair
damaged neural networks or to compensate for
the loss of function in particular structures.
Common impairments resultants from brain
damage include deficits in attention, emotion,
language, learning, memory, movement,
perception and sensation

Amygdala The amygdala has three functionally distinct Fear-processing, emotion processing,
parts- learning, fight or flight response, reward-
(1) the medial group of subnuclei has many processing
connections with the olfactory bulb and olfactory
cortex, (2) the basolateral group has extensive
connections with the cerebral cortex, particularly
the orbital and medial prefrontal cortex and (3) the
central and anterior group of nuclei has many
connections with the brainstem hypothalamus
and sensory structures

(continued )
Structure: Basal Ganglia
Substructures: Caudate nucleus, globus pallidus, nucleus accumbens, putamen, substantia nigra, subthalamic nucleus.
Associated functions: Movement regulation, skill learning, habit formation, reward systems.
Description: A group of structures that regulate the initiation of movements, balance, eye movements, and posture. They are strongly connected to other motor areas in the brain and link the thalamus with the motor cortex. The basal ganglia are also involved in cognitive and emotional behaviors and play an important role in reward and reinforcement, addictive behaviors and habit formation.

Structure: Cingulate Gyrus
Associated functions: Emotion, pain processing, memory, self-regulation.
Description: An important part of the limbic system, it helps regulate emotions and pain. It is thought to directly drive the body's conscious response to unpleasant experiences. In addition, it is involved in fear and the prediction (and avoidance) of negative consequences and can help orient the body away from negative stimuli.

Structure: Corpus Callosum
Substructures: Splenium.
Associated functions: Connects the right and left hemispheres and allows information to pass between them.
Description: Consists of a large bundle of fibers connecting the right and left hemispheres of the brain. Each hemisphere controls movement in the opposite (contralateral) side of the body and can also specialize in performing specific cognitive and perceptual functions. The corpus callosum allows information to move between the hemispheres and is a very important integrative structure (M. Gazzaniga and R. Wolcott Sperry in the 1960s; Turk et al. 2002, e.g., face recognition).
Structure: Dentate Gyrus
Associated functions: Memory formation, possible role in memory recall.
Description: Associated with cognitive disorders. The hippocampal formation has three regions, which are highly interconnected: the dentate gyrus, CA3 and CA1. It is one of the very few regions in the brain where adult neurogenesis (development of new neurons) has been confirmed. The dentate gyrus may play an important role in translating complex neural codes from cortical areas into a simpler code that can be used by the hippocampus to form new memories.

Structure: Entorhinal Cortex
Associated functions: Declarative memory, spatial memory, self-localization.
Description: Plays a major role in memory formation. Two major connections from the entorhinal area (lateral and medial) provide the main input to the hippocampus and are important in preprocessing memorable information. The lateral input stream is thought to convey spatial information to the hippocampus, while the medial input stream conveys non-spatial information. The stream of information from the entorhinal cortex, through the dentate gyrus, to the hippocampus is called the perforant path. Memory formation; preprocesses the memorable information; provides input to the hippocampus.
Structure: Frontal Lobe (i.e., Frontal Eye Fields, Premotor Cortex, Primary Motor Cortex, Broca's Area, Prefrontal Cortex)
Substructures: Frontal eye fields, premotor cortex, primary motor cortex, Broca's area, prefrontal cortex, orbitofrontal cortex, middle frontal gyrus, inferior frontal gyrus.
Associated functions: Executive processes (voluntary behavior, e.g., decision-making, planning, problem-solving, and thinking), voluntary motor control, cognition, intelligence, attention, language processing and comprehension, and many others.
Description: The frontal lobes are part of the cerebral cortex and are the largest of the brain's structures. They are the main site of so-called 'higher' cognitive functions. The frontal lobes contain a number of important substructures, including the prefrontal cortex, orbitofrontal cortex, motor and premotor cortices, and Broca's area. These substructures are involved in attention and thought, voluntary movement, decision-making, and language. Associated, for example, with atypical social skills and personality traits when damaged.

Structure: Hippocampus
Associated functions: Early memory storage, formation of long-term memory, spatial navigation.
Description: An early storage place for long-term memory and the structure in the brain most closely aligned to memory formation. It is important as an early storage place for long-term memory, it is involved in the transition of long-term memory to even more enduring permanent memory, and it also plays an important role in spatial navigation.

Structure: Hypothalamus
Associated functions: Hunger, thirst, body temperature, sexual activity, arousal, parenting, perspiration, blood pressure, heart rate, shivering, pupil dilation, circadian rhythms, sleep.
Description: Regulates a wide range of behavioral and physiological activities. It controls many autonomic functions such as hunger, thirst, body temperature, and sexual activity. To do this, it integrates information from many different parts of the brain and is responsive to a variety of stimuli including light (it regulates circadian rhythms), odors (e.g., pheromones), stress, and arousal (hypothalamic neurons release oxytocin directly into the bloodstream). Other functions controlled by it include parenting behavior, perspiration, blood pressure and heart rate.
Structure: Limbic System
Substructures: Amygdala, cingulate gyrus, dentate gyrus, entorhinal cortex, epithalamus, hippocampus, hypothalamus.
Associated functions: Memory formation and storage, regulating emotion, processing smells, sexual arousal.
Description: A group of brain structures, including the amygdala, hippocampus and hypothalamus, that are involved in processing and regulating emotions, memory and sexual arousal. It is an important element of the body's response to stress and is highly connected to the endocrine and autonomic nervous systems. The limbic system is also responsible for processing the body's response to odors.

Structure: Middle & Inferior Temporal Gyri
Substructures: Middle temporal gyrus, inferior temporal gyrus.
Associated functions: Word retrieval, language and semantic memory processing, visual perception, multimodal sensory integration, autobiographical memory, visual recognition.
Description: Involved in a number of cognitive processes, including semantic memory processing, language processes (middle temporal gyrus), visual perception (inferior temporal gyrus) and integrating information from different senses. They interpret information about faces and are part of the ventral visual pathway, which identifies 'what' things are. The inferior temporal gyrus also participates in some forms of mental imagery.

Structure: Occipital Lobe
Substructures: Cuneus, visual areas V1-V5.
Associated functions: Vision.
Description: The primary visual area of the brain. It receives projections from the retina (via the thalamus), from where different groups of neurons separately encode different visual information such as color, orientation, and motion. Pathways from the occipital lobes reach the temporal and parietal lobes and are eventually processed consciously. The information originates two pathways, the dorsal and ventral streams. The dorsal stream projects to the parietal lobes and processes where objects are located, and the ventral stream projects to structures in the temporal lobes and processes what objects are.
Structure: Parietal Lobe
Substructures: Inferior parietal lobule, superior parietal lobule, somatosensory cortex, precuneus.
Associated functions: Perception and integration of somatosensory information (e.g., touch, pressure, temperature and pain), visuospatial processing, spatial attention, spatial mapping, number representation.
Description: Processes attentional awareness of the environment, is involved in manipulating objects and representing numbers. It has an important role in integrating information from different senses to build a picture of the world. It integrates spatial information from the ventral visual pathway (what things are) and the dorsal visual pathway (where things are), which allows us to coordinate our movements in response to the objects in our environment. It contains a number of distinct reference maps of the body, near space and distant space, which are constantly updated as we move and interact with the world.

Structure: Perirhinal Cortex
Associated functions: Object recognition, memory formation and storage.
Description: Plays an important role in object recognition and in storing information (memories) about objects. It is highly connected to other brain structures, including the amygdala, basal ganglia and frontal cortex. These connections allow it to specialize in associating objects with sensory information and potential consequences (e.g., reward).

Structure: Pons
Associated functions: Regulating breathing, taste and autonomic functions.
Description: The region in the brain most closely associated with breathing and with circuits that generate respiratory rhythms. It forms a bridge between the cerebrum and cerebellum and is involved in motor control, posture, and balance. It is involved in sensory analysis and is the site at which auditory information enters the brain.
Structure: Prefrontal Cortex
Associated functions: Executive processes (voluntary behavior, e.g., decision-making, planning, problem-solving, and thinking), attention, inhibition, intelligence, social skills.
Description: Plays an important role in higher brain functions. It is a critical part of the executive system, which refers to planning, reasoning and judgement. It is involved in personality and emotion by contributing to the assessment and control of appropriate social behaviors.

Structure: Premotor Cortex
Associated functions: Planning and executing motor movements, imitation, empathy.
Description: The premotor cortex consists of a narrow region between the prefrontal and motor cortex. It is involved in preparing and executing limb movements and uses information from other cortical regions to select appropriate movements.

Structure: Primary Motor Cortex
Associated functions: Coordination and initiation of motor movement.
Description: Critical to initiating motor movements. Its areas correspond precisely to specific body parts, e.g., leg movements map to the part of the motor cortex closest to the midline. Not all body parts are equally represented by surface area or cell density: representations of the arm and hand motor area occupy the most space in the motor cortex (unsurprising given their importance to human behavior). Similarly, representations in the motor cortex can become relatively large or small with practice/training.
Structure: Somatosensory Cortex
Substructures: Precuneus.
Associated functions: Sensory processing and integration.
Description: The somatosensory cortex (postcentral gyrus) receives tactile information from the body. Sensory information is carried to the brain by neural pathways through the spinal cord, brainstem, and thalamus, which project to the somatosensory cortex (which in turn has numerous connections with other brain areas). It integrates sensory information (e.g., touch, pressure, temperature, pain and spatial attention), producing a 'homunculus map', similar to that of the primary motor cortex. Sensory information about the feet, for example, maps to the medial somatosensory cortex.

Structure: Subiculum
Associated functions: Memory processing, regulation of the body's response to stress, spatial navigation, information processing.
Description: The main output region of the hippocampus and therefore important to learning and memory. It also plays a role in spatial navigation, mnemonic (symbol) processing and regulating the body's response to stress by inhibiting the HPA axis.

Structure: Superior Temporal Gyrus
Substructures: Planum polare, planum temporale, Wernicke's area.
Associated functions: Sound processing, speech processing and comprehension, auditory memory.
Description: Contains the primary auditory cortex, which is responsible for processing sounds. Specific sound frequencies map precisely onto the primary auditory cortex. This auditory (or tonotopic) map is similar to the homunculus map of the primary motor cortex. Some areas of the superior temporal gyrus are specialized for processing combinations of frequencies, and other areas are specialized for processing changes in amplitude or frequency. The superior temporal gyrus also includes Wernicke's area, which (in most people) is located in the left hemisphere. It is the major area involved in the comprehension of language.
Structure: Temporal Lobes
Substructures: Amygdala, primary auditory cortex, superior temporal gyrus, Wernicke's area, middle temporal gyrus, inferior temporal gyrus, fusiform gyrus.*
Associated functions: Recognition, perception (hearing, vision, smell), understanding language, learning and memory.
Description: Contain a large number of substructures, whose functions include perception, face recognition, object recognition, memory acquisition, understanding language and emotional reactions. Damage to the temporal lobes can result in intriguing neurological deficits called agnosias, which refer to the inability to recognize specific categories (body parts, colors, faces, music, smells).
*Note: In some conventions the amygdala, cingulate cortex, basal ganglia, hippocampus and parahippocampal gyrus are temporal lobe structures; in others they are not.

Structure: Thalamus
Associated functions: Relaying motor and sensory information, memory, alertness, consciousness; contributes to perception and cognition.
Description: Involved in relaying information between the cortex and brain stem and within different cortical structures. Because of this role in corticocortical interactions, the thalamus contributes to many processes in the brain, including perception, attention, timing and movement. It plays a central role in alertness and awareness.

Structure: Ventricles
Substructures: Cerebral aqueduct, choroid plexus, fourth ventricle, lateral ventricle, third ventricle.
Associated functions: Cushion and protect the brain.
Description: The ventricles are interconnected fluid-filled spaces that are extensions of the spinal cord. They have no unique function but provide cushioning against brain damage and are useful landmarks for determining the location of other brain structures.

Structure: Wernicke's Area
Associated functions: Language comprehension.
Description: A functionally defined structure that is involved in language comprehension. In about 97% of humans (including a large majority of left-handers), the major language functions are contained in the left hemisphere of the brain.
Examples from the Cognitive functions hierarchy include interpretation of sensations and
recognition. These differences between the two types of applications result from the
functional hierarchy of cognitive functions and are consistent with the cognitive computing
literature, such as [44-47].

7. Conclusion
The focus of this study is to describe what types of computing functions could be considered
cognitive. This paper neither intends nor is able to present all the cognitive functions of the
human brain; rather, it is intended to open up the idea of cognitive computing functions and
their similarity with human ones, by examples. Setting the cognitive computing functions
into the hierarchy of functions allows us to find the type of cognition and to reach
comparability between the applications.
In this paper, we have constructed a framework to categorize the cognitive computing
functions. The hierarchical structure of the cognitive functions underpins the functions of
the brain's structures. The categories are obtained from the intentional functions of the brain
and from the paths through which sensory stimuli transition into cognitive processes.
To characterize the cognitive computing functions, we have chosen cognitive service
examples which have online demonstrations. Further, we describe their functionality, map
the functionalities of the cognitive service examples to the brain functions and present their
similarities. We use the framework, which classifies and hierarchizes the cognitive service
functions. For comparison, we mapped the NDC application functionalities to the brain
functions and present their similarities. Finally, we summarize the differences and
similarities between the cognitive computing and NDC functions. Both types of examples
produce the results of cognitive functions, and their differences attach to the functional
hierarchy categories of the Language functions, the Emotion and behavior functions, the
Memory functions (e.g., learning) and the Cognitive functions (e.g., interpretation of
sensations). The sample material is quantitatively small, so our proposal for enhancing the
framework is to implement it with machine learning, in which the cognitive functions would
gain weights (e.g., at each categorical level); this would enable refining the descriptions of
the functions (i.e., quantitatively more samples), which enhances the comparability.
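The proposed enhancement could, for instance, attach one learnable weight per hierarchy level to each cognitive function and normalize the accumulated weights into a per-application level profile. The sketch below only illustrates that idea; the function names and weight values are invented, not learned from data:

```python
from collections import Counter

# Hypothetical per-function weights over the hierarchy levels I-IV; in the
# proposed enhancement these weights would be learned from labeled samples.
weights = {
    "learning":               {"III": 1.0},
    "emotion processing":     {"II": 1.0},
    "recognition":            {"IV": 0.7, "I": 0.3},
    "language understanding": {"II": 0.8, "IV": 0.2},
}

def level_profile(functions):
    """Accumulate and normalize the level weights of the detected functions."""
    profile = Counter()
    for function in functions:
        profile.update(weights.get(function, {}))
    total = sum(profile.values()) or 1.0
    return {level: weight / total for level, weight in profile.items()}

# Profile of an application whose description mentions these three functions.
profile = level_profile(["learning", "recognition", "language understanding"])
```

The resulting profile is a probability-like distribution over the hierarchy levels, which would make applications with differently sized function lists directly comparable.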
Cognitive computing functions produce cognition results, for example, in a neural network
which mimics the human nervous system. Although the infrastructure achieves similarity
with human cognitive functions, we also need cognitive algorithms, such as GloVe, which
mimic human cognitive functions. The combination of these features goes beyond human
abilities (i.e., in quantity, speed, and variety), for example in data retrieval and in the ability
to combine data from different sources to generate knowledge.
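GloVe represents each word as a dense vector learned from global word co-occurrence statistics, so semantic relatedness between words can be computed geometrically, typically as cosine similarity. A minimal illustration with invented three-dimensional vectors (real GloVe embeddings are 50-300-dimensional and are loaded from the pre-trained files of [38]):

```python
import math

# Toy word vectors; real GloVe vectors come from the published
# pre-trained files and are much higher-dimensional.
vectors = {
    "joy":       [0.9, 0.1, 0.2],
    "happiness": [0.8, 0.2, 0.1],
    "anger":     [-0.7, 0.6, 0.3],
}

def cosine(u, v):
    """Cosine similarity: angle-based relatedness of two word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    return dot / (norm_u * norm_v)

similar = cosine(vectors["joy"], vectors["happiness"])
dissimilar = cosine(vectors["joy"], vectors["anger"])
```

With well-trained vectors, near-synonyms such as "joy" and "happiness" score close to 1, while unrelated or opposed words score much lower.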

Compliance with ethical standards


Funding: No funding.
Ethical approval: This article does not contain any studies with human participants or
animals performed by any of the authors.
Conflict of Interest: Author Ulla Gain declares that she has no conflict of interest.
Informed consent: Informed consent was obtained from all individual participants included
in the study.

References
[1] 3D Brain, 3D Brain from Genes to Cognition Online (www.g2conline.org), Cold Spring Harbor
Laboratory, 2017. Available: <https://www.dnalc.org/view/16989-3D-Brain-from-Genes-to-
Cognition-Online-www-g2conline-org-.html>.
[2] medical-dictionary 2017, Medical dictionary. Available: <http://medical-dictionary.
thefreedictionary.com/somatic+sensory+area>.
[3] Bailey Regina, Wernicke’s Area, ThoughtCo., 2016. Available <https://www.thoughtco.com/
wernickes-area-anatomy-373231>.
[4] Bailey Regina, All About Broca’s Area in the Brain, ThoughtCo., 2017. Available <https://www.
thoughtco.com/brocas-area-anatomy-373215>.
[5] Bailey Regina, Parietal Lobes of the Brain, ThoughtCo., 2016. Available: <https://www.
thoughtco.com/parietal-lobes-of-the-brain-3865903>.
[6] Bailey Regina, Occipital Lobes and Visual Perception, ThoughtCo., 2017. Available: <https://
www.thoughtco.com/occipital-lobes-anatomy-373224>.
[7] Bailey Regina, The Five Senses and How They Work, ThoughtCo., 2016. Available <https://
www.thoughtco.com/five-senses-and-how-they-work-3888470>.
[8] T. Kitamura, S.K. Ogawa, D.S. Roy, T. Okuyama, M.D. Morrissey, L.M. Smith, R.L. Redondo, S.
Tonegawa, Engrams and circuits crucial for systems consolidation of a memory, Science (2017),
https://doi.org/10.1126/science.aam6808.
[9] Bergland Christopher, MIT Scientists Identify Brain Circuits of Memory Formation, Psychology
today, 2017. Available <https://www. psychologytoday.com/blog/the-athletes-way/201704/mit-
scientists-identifybrain-circuits-memory-formation>.
[10] S. Manninen, L. Tuominen, R. Dunbar, T. Karjalainen, J. Hirvonen, E. Arponen, R. Hari, I.P.
Jääskeläinen, M. Sams, L. Nummenmaa, Social laughter triggers endogenous opioid release in
humans, J. Neurosci. (2017), https://doi.org/10.1523/JNEUROSCI.0688-16.2017.
[11] R.J. Dolan, Emotion, cognition, and behavior, Science 298 (5596) (2002) 1191.
[12] Bailey Regina, Frontal Lobes and Their Function, ThoughtCo., 2017. Available: <https://www.
thoughtco.com/frontal-lobes-anatomy-373213>.
[13] Knierim James, Motor Cortex, Department of Neuroscience, The Johns Hopkins University, 2017,
Available: <http://neuroscience.uth.tmc.edu/s3/chapter03.html>.
[14] Poo et al., What is memory? The present state of the engram, BMC Biol. (2016), https://doi.org/10.
1186/s12915-016-0261-6.
[15] A. Bechara, H. Damasio, A.R. Damasio, Emotion, Decision Making and the Orbitofrontal Cortex,
Cerebral Cortex Mar 2000, 10, Oxford University Press, 2000, pp. 295–307.
[16] Bergland Christopher, 5 New Studies Report Previously Unknown Cerebellum Functions,
Psychology today, 2017. Available <https://www.psychologytoday.com/blog/the-athletes-way/
201704/5-new-studies-reportpreviously-unknown-cerebellum-functions>.
[17] Danny A. Spampinato, Hannah J. Block, Pablo A. Celnik, Cerebellar–M1 connectivity changes
associated with motor learning are somatotopic specific, J. Neurosci. 37 (9) (2017) 2377, https://doi.
org/10.1523/JNEUROSCI.2511-16.2017.
[18] Giovannucci Andrea, Badura Aleksandra, Deverett Ben, Najafi Farzaneh, D. Pereira Talmo, Gao
Zhenyu, Ozden Ilker, D. Kloth Alexander, Pnevmatikakis Eftychios, Paninski Liam, I. De Zeeuw
Chris, F. Medina Javier, S.-H. Wang Samuel, Cerebellar granule cells acquire a widespread predictive
feedback signal during motor learning, Nat. Neurosci. (2017), https://doi.org/10.1038/nn.4531.
[19] Mark J. Wagner, Kim Tony Hyun, Savall Joan, Mark J. Schnitzer, Luo Liqun, Cerebellar granule
cells encode the expectation of reward, Nature (2017), https://doi.org/10.1038/nature21726.
[20] K.L. Parker, Y.C. Kim, R.M. Kelley, A.J. Nessler, K.-H. Chen, V.A. Muller-Ewald, N. C. Andreasen,
N.S. Narayanan, Delta-frequency stimulation of cerebellar projections can compensate for
schizophrenia-related medial frontal dysfunction, Mol. Psychiatry (2017), https://doi.org/10.1038/
mp.2017.50.
[21] D. Caligiore, F. Mannella, M.A. Arbib, G. Baldassarre, Dysfunctions of the basal ganglia-
cerebellar-thalamo-cortical system produce motor tics in Tourette syndrome, PLoS Comput Biol
13 (3) (2017) e1005395, https://doi.org/10.1371/journal.pcbi.1005395.
[22] A. Wright, Neuroscience online: an electronic textbook for the neurosciences, UTHealth,
McGovern Medical School, 2017, open access: <http://neuroscience.uth.tmc.edu/s4/chapter09.html>.
[23] Y. Wang, Y. Wang, S. Patel, D. Patel, A layered reference model of the brain (LRMB), IEEE Trans.
Syst. Man Cybernet. Part C (Applications and Reviews), 36 (2) (2006) 124–133.
[24] Bergland Christopher, Cognitive Benefits of Exercise Outshine Brain-Training Games,
Psychology today, 2017. Available <https://www.psychologytoday.com/blog/the-athletes-way/
201704/cognitive-benefits-exerciseoutshine-brain-training-games>.
[25] Dustin J. Souders, Walter R. Boot, Blocker Kenneth, Vitale Thomas, Nelson A. Roque,
Charness Neil, Evidence for narrow transfer after short-term cognitive training in older adults,
Front. Aging Neurosci. 9 (2017).
[26] McKinsey Global Institute, A future that works: automation, employment, and productivity.
January 2017. Available from: <http://www.mckinsey.com/global-themes/digital-disruption/
harnessing-automation-for-a-future-thatworks>.
[27] IBM, Overview of the IBM Watson Visual Recognition service, demonstration, 2017. Available from:
<https://visual-recognition-demo.mybluemix.net/?cm_mc_uid534548500692914900141181&cm_mc_
sid_5020000051496864053&cm_mc_sid_5264000051496864053>.
[28] MS, Speaker Recognition API, Demonstration, 2017. Available <https://azure.microsoft.com/en-
us/services/cognitive-services/speaker-recognition/>.
[29] IBM, Tone Analyzer, Watson, 2017. Available: <https://tone-analyzer-demo.mybluemix.net/?cm_
mc_uid544457173584014971676232&cm_mc_sid_5020000051497167623&cm_mc_sid_
5264000051497167623>.
[30] IBM, Overview of the IBM Watson Visual Recognition service, Watson, 2017. Available from:
<https://www.ibm.com/watson/developercloud/doc/visualrecognition/index.html>.
[31] S. Earley, Cognitive Computing, Analytics, and Personalization, IT Pro, nro July/August, 2015,
pp. 12–18.
[32] IBM, IBM Visual Recognition, How it works, Watson, 2017. Available from: <https://www.ibm.
com/watson/developercloud/visual-recognition.html#how-it-works-block>.
[33] IKEA, TJENA, 2017. Available from: <http://www.ikea.com/us/en/catalog/products/80364398/
#/70269448>.
[34] M. Moore, Image of Trump, 2017. Available from: <https://michaelmoore.com/trumpwillwin/>.
[35] MS (2017a) Speaker Recognition API, Documentation, Available https://docs.microsoft.com/en-us/
azure/cognitive-services/speaker-recognition/home.
[36] John H.L. Hansen, Taufiq Hasan, Speaker recognition by machines and humans, IEEE Signal
Processing Magazine, November 2015, pp. 74–99.
[37] IBM, About Tone Analyzer, Watson, 2017. Available: <https://www.ibm.com/watson/
developercloud/doc/tone-analyzer/index.html>.
[38] Jeffrey Pennington, Richard Socher, Christopher D. Manning, GloVe: Global Vectors for Word
Representation, 2014, Available: <https://nlp.stanford.edu/pubs/glove.pdf>.
[39] Stanford, Lecture 3 | GloVe: Global Vectors for Word Representation, Stanford University School
of Engineering, 2017. Available: <https://www.youtube.com/watch?v=ASn7ExxLZws>.
[40] About Tone Analyzer, General purpose tones, Watson, Available: <https://www.ibm.com/
watson/developercloud/doc/tone-analyzer/using-tone.html#tones>.
[41] John E. Kelly, Computing, cognition and the future of knowing How humans and machines are
forging a new age of understanding, IBM, 2015. Available: <http://researchweb.watson.ibm.com/
software/IBMResearch/multimedia/Computing_Cognition_WhitePaper.pdf>.
[42] Apple, Use the Activity app on your Apple Watch, 2017. Available: <https://support.apple.com/
en-us/HT204517>.
[43] Zac Hall, How to use or enable/disable Parked Car alerts from Maps on Ios 10 for iPhone,
9TOP5Mac, 2016, Available: <https://9to5mac.com/2016/10/06/how-to-use-turn-off-on-parked-car-
alerts-maps-ios-10-iphone/>.
[44] M. Srivathsan, K. Yogesh Arjun, Health monitoring system by prognotive computing using big
data analytics, Proc. Comput. Sci. 50 (2015) 602-609.
[45] Y. Chen, J.D.E. Argentinis, G. Weber, IBM Watson: how cognitive computing can be applied to
big data challenges in life sciences research, Clin. Therapeutics 38 (4) (2016) 688–701.
[46] B. Williamson, Computing brains: learning algorithms and neurocomputation in the smart city,
Inform. Commun. Soc. 20 (1) (2017) 81-99.
[47] IBM Bluemix, The science behind the service. Available: <https://console.bluemix.net/docs/
services/personality-insights/science.html#science>.

Appendix A
The 3D Brain is a mobile application which represents the brain in three-dimensional space. It contains 29
interactive structures (see in more detail: https://www.dnalc.org/view/16989-3D-Brain-from-Genes-to-
Cognition-Online-www-g2conline-org-.html). Table 2 summarizes the content of the 3D Brain (i.e., the
brain structures, the associated functions of each structure and the descriptions).

Appendix B
The IBM Tone Analyzer service describes the social tendencies, emotions and language styles in context
as follows [36]:
the social tendencies:
“Openness: The extent a person is open to experience a variety of activities.”
“Conscientiousness: The tendency to act in an organized or thoughtful way.”
“Extraversion: Is the tendency to seek stimulation in the company of others.”
“Agreeableness: Is the tendency to be compassionate and cooperative towards others.”
“Emotional Range: The extent a person’s emotion is sensitive to the environment.”
the emotions:
“Anger: Is Evoked due to injustice, conflict, humiliation, negligence or betrayal. If anger is active, the
individual attacks the target, verbally or physically. If anger is passive, the person silently sulks and feels
tension and hostility”.
“Disgust: An emotional response of revulsion to something considered offensive or unpleasant. It is a
sensation that refers to something revolting.”
“Fear: Is a response to impending danger. It is a survival mechanism that is a reaction to some
negative stimulus. It may be a mild caution or an extreme phobia.”
“Joy: Joy or happiness has the shades of enjoyment, satisfaction and pleasure. There is a sense of well-
being, inner peace, love, safety and contentment.”
“Sadness: Indicates a feeling of loss and disadvantage. When a person can be observed to be quiet, less
energetic and withdrawn, it may be inferred that sadness exists.”
the language styles:
“Analytical: A person’s reasoning and analytical attitude about things.”
“Confident: A person’s degree of certainty.”
“Tentative: A person’s degree of inhibition.”
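Programmatically, the tone descriptions above form three disjoint categories, so a flat set of tone scores can be regrouped by category. A small sketch under that assumption (the tone scores in the example are invented; the real service reports scores between 0 and 1):

```python
# The tone names of each category, as listed in this appendix.
CATEGORIES = {
    "social tendencies": {"openness", "conscientiousness", "extraversion",
                          "agreeableness", "emotional range"},
    "emotions": {"anger", "disgust", "fear", "joy", "sadness"},
    "language styles": {"analytical", "confident", "tentative"},
}

def group_tones(scores):
    """Group a flat {tone: score} dictionary by tone category."""
    grouped = {name: {} for name in CATEGORIES}
    for tone, value in scores.items():
        for name, members in CATEGORIES.items():
            if tone.lower() in members:
                grouped[name][tone] = value
    return grouped

# Invented example scores for illustration only.
result = group_tones({"joy": 0.72, "analytical": 0.61, "openness": 0.55})
```

Grouping the scores this way mirrors how the service's documentation organizes its output: one score vector per tone category.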

Appendix C
To illustrate the differences in content, the human cognitive functions identified by this technique are
summarized for each cognitive service: first the functions of the cognitive applications and then the
functions used by the NDC applications. Table 3 lists the cognitive functions used in the cognitive applications.
Table 3. The cognitive services and the cognitive functions of them.

Service | The cognitive functions
IBM Visual recognition | Learning, the higher-order concept formation, visual perception, interpretation of visual information, recognition; object and face recognition; spatial mapping
MS Speaker Recognition | Learning of unconditioned associations, Association, Learning, Sensory interpretation, Interpretation of the audio stimuli, Tonotopic map functions, Speech processing, Language processing, Personality expression, Analogy, Decision making, Recognition
IBM Tone Analyzer | Language processes, Language processing, Reading functions, Semantic memory processing, the Learning of conditioned associations, Learning, the Comprehension of complex syntax, the Comprehension of reading functions, Language understanding, Language memory processing, Executive processing, Association, Recognition, Abstract thinking, Deduction, Analysis, Analogy, Quantification, Cognition, Interpretation of sensations, Sociability, Inhibition, Empathy, Emotion, Emotion processing, Emotional memory, Feeling states, Posture, Personality, Personality expression, Personality traits, Emotion, Emotion processing, Association, Higher-order concept formation, Abstract thinking, Analogy, Reasoning, Inhibition, Personality expression, Posture, Cognition

These cognitive functions belong to the following cognitive function hierarchies (Figure 13):
Language functions (e.g., language understanding), Emotion and behavior functions (e.g., emotion
processing), Memory functions (e.g., learning), and Cognitive functions (e.g., interpretation of
sensations, recognition, analogy).
The human cognitive functions similar to the cognitive functions of the NDC applications are listed
in Table 4. The functions divide into three cognitive function hierarchies (Figure 14): Memory functions
(41%); Visual, Auditory, Motor, Sensation and Homeostasis functions (31%); and Cognitive functions (28%).
The Language functions and the Emotion and behavior functions (0%) (i.e., level II of the hierarchical
cognitive functions) are not in use.

Figure 13. The cognitive application functions by the cognitive functions functional hierarchy.
Table 4. The NDC applications and the cognitive functions of them.

Application | The cognitive functions
Apple Activity app | Receive sensory information, Processing sensory information, Sensory interpretation, Information processing, Preprocessing the memorable information, Working memory, Attention, Timing, Memory processing, Analysis, Analogy, Quantification, Maintain goals, Synthesis, Memorizing, New memory formation, Memory storage, Memory retrieval
Apple iPhone Parked Car | Receive sensory information, Working memory, Processing sensory information, Sensory interpretation, Integrate information (Where the things are), Integrate information (What the things are), Preprocessing the memorable information, Information processing, Association, Higher-order concept formation, Memory formation, Memory storage, Choosing the memories to be stored, Choosing the place where memories are stored, Spatial mapping, Memorizing, New memory formation, Memory, Memory processing, Memory functions, Memory retrieval, Reference map functions of near and distance space, Executive processing, Spatial navigation

Figure 14. The NDC application’s functions by the cognitive functions functional hierarchy.

Corresponding author
Ulla Gain can be contacted at: [email protected]

Paper VI
Authors: Gain U, Koponen M., Hotti V.

Year: 2018.

Article title: Behavioral interventions from trait insights.

Book: Well-Being in the Information Society. Fighting Inequalities

7th International Conference, WIS 2018, Turku, Finland, August 27-29, 2018, Proceedings.

Communications in Computer and Information Science, 907:14-27, Springer, Cham,


https://www.springerprofessional.de/en/behavioral-interventions-from-trait-insights/16029424

Publisher: Springer International Publishing

Permissions from co-authors via email: Koponen Mikko received 9.11.2021 at 16:02; Hotti Virpi received 9.11.2021 at
16:33

Consent to Publish, Communications in Computer and Information Science:

“§2 Rights Retained by Author,…Author retains the right to use his/her Contribution for his/her further scientific
career by including the final published paper in his/her dissertation or doctoral thesis provided acknowledgment is
given to the original source of publication. Author also retains the right to use, without having to pay a fee and
without having to inform the Publisher, parts of the Contribution (e.g. illustrations) for inclusion in future work.”
(Consent to Publish, Communications in Computer and Information Science, 2018)
Behavioral Interventions from Trait Insights

Ulla Gain(&), Mikko Koponen, and Virpi Hotti

School of Computing, University of Eastern Finland, 70211 Kuopio, Finland


{gain,virpi.hotti}@uef.fi, [email protected]

Abstract. Individuals have stated and unstated beliefs and intentions. The
theory of planned behavior is expressed by a mathematical function where
beliefs have empirically derived coefficients. However, personality traits can
help account for differences in beliefs. In this paper, we find out how we can
amplify behavioral interventions from text-based trait insights. Therefore, we
research the techniques (e.g., sentence and word embedding) behind text-based
traits. Furthermore, we exemplify text-based traits by 52 personality characteristics
(35 dimensions and facets of the Big Five, 12 needs and five values) and 42
consumption preferences via the API of the IBM Watson™ Personality Insights
service. Finally, we discuss the possibilities of behavioral interventions based on
the personality characteristics and consumption preferences (i.e., text-based
differences and similarities between individuals).

Keywords: Behavior · Belief · Intention · Insight · Personality trait

1 Introduction

Individual insights challenge our privacy. The GDPR (General Data Protection
Regulation) makes the controllers and the processors responsible when they
process personal data “relating to an identified or identifiable individual” [1].
Personal data have been collected with and without the consent of humans that are either
identified or directly or indirectly identifiable natural persons (i.e., data subjects) [2].
For example, personal interests derived from tracking the use of internet web sites,
a personal or behavioral profile, and religious or philosophical beliefs can be used to
identify a natural person [3]. Psychographic data (e.g., attitudes, interests, personality,
and values) are used to fashion individually targeted actions. For example,
Cambridge Analytica classified personalities according to the OCEAN scale (Openness,
Conscientiousness, Extroversion, Agreeableness, and Neuroticism) and fashioned
individually targeted messages [4].
The intentions are the immediate determinants of behavior. It is possible to predict
intentions with regressions based on three independent determinants of the intentions
(i.e., attitude toward the behavior, subjective norm, and perceived behavioral control).
The theory of planned behavior is expressed by the mathematical function
BI = a * AB + b * SN + c * PBC (where BI = behavioral intention, AB = attitude
toward behavior, SN = social norm, PBC = perceived behavioral control, and a, b and
c = empirically derived weights/coefficients). The coefficients of the regressions are

© Springer Nature Switzerland AG 2018


H. Li et al. (Eds.): WIS 2018, CCIS 907, pp. 14–27, 2018.
https://doi.org/10.1007/978-3-319-97931-1_2

defined for the intentions of different kinds such as using cannabis, donating blood,
buying stocks and engaging in leisure activities [5].
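The weighted-sum form of the theory can be sketched in code; the weights and ratings below are hypothetical placeholders, since the coefficients are derived empirically for each behavior [5].

```python
def behavioral_intention(ab, sn, pbc, a, b, c):
    """Theory of planned behavior: BI = a*AB + b*SN + c*PBC.

    ab, sn, pbc -- attitude toward the behavior, social norm, and
                   perceived behavioral control (e.g., rating-scale scores)
    a, b, c     -- empirically derived regression weights
    """
    return a * ab + b * sn + c * pbc

# Hypothetical weights and ratings, for illustration only.
bi = behavioral_intention(ab=6, sn=4, pbc=5, a=0.5, b=0.2, c=0.3)
print(bi)  # weighted sum of the three determinants, approximately 5.3
```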
Background factors can help account for differences in beliefs. The background
factors are divided into three groups. Personal factors are general attitudes, personality
traits, values, emotions and intelligence. Social factors are age, gender, race, ethnicity,
education, income and religion. Information factors are experience, knowledge and
media exposure [5].
Behavioral interventions generally expose people to new information designed to
change their behavioral, normative, and control beliefs. Hence, background factors
(e.g., personality traits) can help account for differences in beliefs. For example, traits
(e.g., helpfulness and independence) are dispositional explanations of behavior [5].
A composite application (e.g., a mashup) can be built by combining existing functions
(e.g., APIs). For example, cognitive services or mashups of them [6] are used to
manifest the behaviors and intents of individuals. The manifested insights are
algorithmically inferred from the raw data. For example, a speech or text can be
extracted into tones and traits [7], the whole set of which forms a personal and
behavioral profile summarizing and manifesting preferences. There might be
similarities between the traits of the individuals that are doing the same (or similar)
things or are interested in the same (or similar) things.
In this paper, we figure out the techniques that are behind text-based traits. Moreover,
we produce text-based traits by the IBM Watson™ Personality Insights service [8]. We
will clarify whether the trait insights of the IBM Watson™ Personality Insights service
can be used in behavioral interventions. The IBM Watson™ Personality Insights
service uses an open-vocabulary approach to infer personality characteristics (i.e., traits)
from text input. The service deploys the word-embedding technique GloVe (Global
Vectors for Word Representation) to map the words of the input text into a vector
representation (Sect. 2). Thereafter, a machine-learning algorithm calculates percentiles
and raw scores of the personality characteristics from word-vector representations,
and consumption preference scores from the personality characteristics
(Sect. 3). Finally, we discuss whether behavioral interventions from text-based trait
insights are reliable (Sect. 4).

2 From Text to Vectors

Raw text has to be transformed into a structured form. A bag-of-words is a sparse matrix,
i.e., a matrix where most of the elements are zero, which arises in real-world problems [9]. The bag-of-
words is commonly followed by tf-idf (i.e., term frequency and inverse document
frequency) weighting, or we produce the tf-idf weighted matrix directly
from the text. When we have the tf-idf weighted matrix, we have to reduce its dimensionality
using separate methods (e.g., singular-value decomposition, SVD) or “we can
feed the tf-idf weighted matrix directly into a neural net to perform dimensionality
reduction” [10].
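A minimal sketch of this step, assuming the textbook tf-idf variant (raw term count times log(N/df)) rather than any particular library’s weighting:

```python
import math
from collections import Counter

def tf_idf(docs):
    """Dense tf-idf matrix from tokenized documents: raw term
    count weighted by log(N / document frequency)."""
    vocab = sorted({w for d in docs for w in d})
    n = len(docs)
    df = {w: sum(w in d for d in docs) for w in vocab}
    matrix = []
    for d in docs:
        tf = Counter(d)
        matrix.append([tf[w] * math.log(n / df[w]) for w in vocab])
    return vocab, matrix

docs = [["gun", "interface", "skiing"],
        ["human", "interface", "gun"],
        ["sport", "interface", "human"]]
vocab, matrix = tf_idf(docs)
# 'interface' occurs in every document, so log(3/3) = 0 zeroes it
# out in every row; rarer words such as 'skiing' keep their weight.
```

The dimensionality of such a matrix can then be reduced, for example, with a truncated SVD of the document-term matrix.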
The dimensions of the matrixes have to be reduced. Therefore, word or sentence
embedding (Sect. 2.1) with several corpora of text [11] is used. The meaning of
word embedding is exemplified by the GloVe algorithm (Sect. 2.2).

2.1 Word and Sentence Embedding


Word or sentence embedding concerns natural language processing techniques via
neural networks, or via matrix factorization. Dimensionality reduction techniques are
applied to the datasets of co-occurrence statistics between words on a corpus of text.
Words are embedded in a continuous vector space where semantically similar
words are embedded near to each other. The words that occur in the same contexts tend
to share a similar semantic meaning. For example, the Turku NLP Group [12] has
implemented the demo to illustrate the nearest words, the similarity of two words, and
word analogy.
Word vectors are generated by algorithms. Each center word (cw) has surrounding
words, the symbol of which is sw (Fig. 1). The idea of conveying the words of the sentence
(i.e., text) into a matrix presentation is to capture the embedded semantic and syntactic
regularities of the sentence.

Fig. 1. Example sliding window size 5 from a word to a vector
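The sliding window of Fig. 1 can be sketched as follows (window size 5, i.e., a center word with up to two surrounding words on each side; truncating the window at sentence edges is an assumption of this sketch):

```python
def windows(tokens, size=5):
    """Yield (center word, surrounding words) pairs for a sliding
    window of odd size, truncated at the sentence boundaries."""
    half = size // 2
    for i, cw in enumerate(tokens):
        sw = tokens[max(0, i - half):i] + tokens[i + 1:i + 1 + half]
        yield cw, sw

tokens = "athlete sport competition at distance".split()
pairs = list(windows(tokens))
# the middle word sees two neighbors on each side:
# ('competition', ['athlete', 'sport', 'at', 'distance'])
```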

Sentence embedding uses encoder-decoder models such as LSTM (long short-term
memory) and CNN (convolutional neural network) networks to solve sequence-to-sequence
tasks [13] such as summarization [14]. Both supervised and unsupervised learning are
applied [15].

2.2 Example of the GloVe Application


One of the most famous word embedding algorithms is GloVe [16]. It is used instead of
word2vec [17], for example, in the IBM Watson™ Personality Insights service [18].
The main function of Global Vectors for Word Representation (GloVe) is to
build a co-occurrence matrix: the corpus is read in and the co-occurrence values
are calculated.
First, we created a vocabulary having 19 words (i.e., and, at, athlete, biathlon,
compete, competition, distance, gun, guns, himself, human, interface, like, result,
shooting, skiing, sometimes, sport, 50 km) from the statements as follows:
0. “gun interface skiing
1. human interface gun
2. sport interface human


3. biathlon interface sport
4. athlete interface biathlon
5. athlete sport competition at distance 50 km
6. athlete result distance 50 km
7. athlete like shooting and skiing
8. athlete like guns and skiing
9. Sometimes athlete compete competition
10. Sometimes athlete compete himself”)
The vocabulary is tabulated as tuples where each word has an index and a number of
occurrences. For example, the word ‘gun’ has the index zero and it occurs two times
(0,2), the word ‘interface’ has the index one and it occurs five times (1,5), and the word
‘skiing’ has the index two and it occurs three times (2,3).
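This bookkeeping can be reproduced in a few lines; ‘50 km’ is joined into a single token here only to keep the whitespace tokenization simple:

```python
statements = [
    "gun interface skiing", "human interface gun",
    "sport interface human", "biathlon interface sport",
    "athlete interface biathlon",
    "athlete sport competition at distance 50km",
    "athlete result distance 50km",
    "athlete like shooting and skiing",
    "athlete like guns and skiing",
    "sometimes athlete compete competition",
    "sometimes athlete compete himself",
]

def build_vocab(statements):
    """Map each word to an (index, occurrences) tuple, indexing
    the words in order of first appearance."""
    vocab = {}
    for s in statements:
        for w in s.lower().split():
            if w in vocab:
                idx, cnt = vocab[w]
                vocab[w] = (idx, cnt + 1)
            else:
                vocab[w] = (len(vocab), 1)
    return vocab

vocab = build_vocab(statements)
# vocab['gun'] == (0, 2), vocab['interface'] == (1, 5),
# vocab['skiing'] == (2, 3); 19 words in total
```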
Second, the co-occurrences of the words are calculated [19]. The co-occurrences
are tabulated triples where the first item is the index of the vocabulary word, the second
item is the index of the context word, and the third item is the ratio of co-occurrence
probability that the first item appears in the context of the second item. For example,
the word ‘gun’ the index of which is zero concerns three triples (0,1,2.0), (0,2,0.5),
(0,3,0.5) where the index one is the word ‘interface’, index two is the word ‘skiing’,
and the index 3 is the word ‘human’.
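The triples above are consistent with a 1/distance weighting, which the GloVe reference implementation uses: each word pair in a statement contributes the inverse of the distance between the words. A sketch under that assumption:

```python
from collections import defaultdict

statements = ["gun interface skiing", "human interface gun"]

vocab = {}                       # word -> index, by first appearance
for s in statements:
    for w in s.split():
        vocab.setdefault(w, len(vocab))

def cooccurrences(statements, vocab):
    """Symmetric co-occurrence counts: every word pair in a
    statement adds 1/distance to both ordered cells."""
    counts = defaultdict(float)
    for s in statements:
        ids = [vocab[w] for w in s.split()]
        for i in range(len(ids)):
            for j in range(i + 1, len(ids)):
                weight = 1.0 / (j - i)   # nearer pairs weigh more
                counts[(ids[i], ids[j])] += weight
                counts[(ids[j], ids[i])] += weight
    return counts

counts = cooccurrences(statements, vocab)
# 'gun' is next to 'interface' in both statements: (0, 1) -> 2.0
# 'gun' is two steps from 'skiing' and from 'human': 0.5 each
```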
Third, the GloVe algorithm produces a co-occurrence matrix where the rows
illustrate the vocabulary words and the columns illustrate the statements. For example,
the statement “athlete sport competition at distance 50 km” the words of which are
indexed as [4, 6–10] and the corresponding values for each word are calculated in the
co-occurrence matrix (Table 1).
We will apply the model (i.e., the co-occurrence matrix), for example, for similarity
evaluation. For example, there are 15 similarities (and, at, compete, competition, dis-
tance, gun, guns, human, interface, like, result, skiing, sport, sometimes, 50 km) of the
statements in our example. In general, GloVe is very useful with natural language
processing tasks such as demonstrating semantic and syntactic regularities. It can even
find the correct words from outside of the ordinary vocabulary [19].
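One routine way to apply such a model is to compare row vectors with cosine similarity, where words sharing contexts score close to 1; the vectors below are toy values for illustration, not the ones computed above:

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Toy row vectors: 'gun' and 'guns' share contexts, 'skiing' does not.
gun    = [2.0, 0.5, 0.5, 0.0]
guns   = [1.8, 0.6, 0.4, 0.1]
skiing = [0.1, 0.0, 0.2, 1.9]
assert cosine(gun, guns) > cosine(gun, skiing)
```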

3 Example of Trait-Based Insights

First, we research techniques to get personality characteristics and consumption pref-


erences via API of the IBM Watson™ Personality Insights service (Sect. 3.1). Second,
we analyze trait insights (Sect. 3.2).

3.1 Traits via API of the IBM Watson™ Personality Insights Service
Web data are messy, and it is time-consuming to collect data from several channels by
different APIs. Therefore, the Futusome service is used to make a data dump of web
data. The data dump was created on 22.02.2018 at 11:15 UTC and it contains 53294
messages (Table 2) concerning ‘ampumahiihto’ (biathlon) [20]. The data dump is the

Table 1. Example of the calculated values of the co-occurrence matrix

Vocabulary word | Value in the column of statement 5
0 - gun |
1 - interface |
2 - skiing |
3 - human |
4 - sport | −0.0729962
5 - biathlon |
6 - athlete | −0.318759
7 - competition | 0.17229
8 - at | 0.132553
9 - distance | −0.529117
10 - 50 km | −0.397189
11 - result |
12 - like |
13 - shooting |
14 - and |
15 - guns |
16 - sometimes |
17 - compete |
18 - himself |

Excel file the fields of which are the author name, time stamp, channel type, message
link, and message text.
The messages of each author are combined into 424-44168 words before the API
call. Moreover, the retweets and duplicates of the forum posts are removed. Finally,
there were 514 authors of the messages. It is possible to use the IBM Watson services
in Python [21]. Therefore, Python is used to combine the messages, to call the IBM
Personality Insights API [22], and to modify the API result by trait ids (Appendix A).
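The preprocessing step can be sketched as follows; the tuple layout and the rule of dropping the ‘twitter_retweet’ channel and exact duplicate texts are assumptions based on the description above, and the API call itself is omitted:

```python
def combine_by_author(rows):
    """Join each author's messages into one document, skipping
    retweets and exact duplicate texts."""
    seen = set()
    docs = {}
    for author, channel, text in rows:
        if channel == "twitter_retweet" or text in seen:
            continue
        seen.add(text)
        docs.setdefault(author, []).append(text)
    return {a: " ".join(msgs) for a, msgs in docs.items()}

rows = [("anna", "twitter_tweet", "hieno kisa"),
        ("ben", "twitter_retweet", "hieno kisa"),
        ("anna", "forum_post", "ampumahiihto kiinnostaa")]
docs = combine_by_author(rows)
# {'anna': 'hieno kisa ampumahiihto kiinnostaa'}
```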

3.2 Percentiles and Scores


The IBM Watson™ Personality Insights service computes personality characteristics -
five dimensions (i.e., Big Five or OCEAN) and 30 facets, 12 needs and five values. Big
Five describes “how a person generally engages with the world”, the needs describe
what a person “hope to fulfill when [she] consider[s] a product or service”, and the
values (or beliefs) “convey what is the most important to [a person]” [18]. There are
explanations for the high values of the needs and values. However, there are explanations
for both the high and low values of the dimensions and facets of the personality
[23]. For example, if the cautiousness facet is 0.79, the explanation is that the
individual is deliberate and carefully thinks through decisions before making them
[23]. There are percentiles and raw scores for the personality traits (i.e., dimensions and
facets of the Big Five, needs [24] and values [25]) – percentiles are normalized scores,
whereas raw scores are based solely on the text [18]. It is observable that the

Table 2. Number of authors per channel


Channel Number of authors
blog_answer 161
blog_comment 136
blog_post 718
facebook_comment 2918
facebook_event 53
facebook_link 3082
facebook_photo 1356
facebook_post 111
facebook_status 674
facebook_video 299
forum_post 9916
googleplus_post 28
googleplus_post_comment 2
instagram_image 580
instagram_image_comment 102
news_comment 4496
twitter_retweet 4207
twitter_tweet 23935
youtube_video 34
youtube_video_comment 11

interpretations of the results differ a lot if we make the interpretation based on the
percentiles or based on the raw scores (Appendix B).
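The difference between the two score types can be illustrated with a minimal percentile computation; the reference sample below is hypothetical, whereas the service normalizes against its own population sample [18]:

```python
def percentile(raw, sample):
    """Share of a reference sample that scores below the raw
    score -- the sense in which a percentile is normalized."""
    return sum(1 for s in sample if s < raw) / len(sample)

sample = [0.40, 0.55, 0.60, 0.70, 0.82]   # hypothetical population
p = percentile(0.79, sample)
# four of the five reference scores fall below 0.79 -> 0.8
```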
IBM has identified 104 consumption preferences correlated with personality. IBM
has selected 42 consumption preferences “for which personality-based classification
performed at least 9 percent better than random classification” [18]. The consumption
preference scores (Appendix C) are derived from the personality characteristics (i.e.,
Big Five, needs and values) inferred from the input text [26]. There are eight categories
having 42 preferences as follows [27]: 12 shopping preferences, 10 movie preferences,
nine (9) music preferences, five (5) reading and learning preferences, three (3) health
and activity preferences, one (1) entrepreneurship preference, one (1) environmental
concern preference, and one (1) volunteering preference.
We realized that some personality characteristics correlate statistically significantly
with the consumption preferences. There are some minor differences in the correlation
coefficients when we compare the correlations of the consumption preferences versus
the raw scores of the personality characteristics with those of the consumption preferences
versus the percentiles of the personality characteristics. However, we do not know how
the words of the texts are mapped into the personality characteristics, and likewise we do
not know how the personality characteristics affect the consumption preferences. Therefore,
correlation calculations (Fig. 2) illustrate the relationships. The Pearson correlations
are calculated by the rcorr function of the Hmisc (Harrell Miscellaneous) package
[28].
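The rcorr function returns both the coefficients and their p-values; the coefficient itself reduces to the familiar formula, sketched here in pure Python with toy scores (the significance test is omitted):

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

trait = [0.10, 0.40, 0.55, 0.70, 0.90]   # e.g., facet percentiles
pref  = [0.20, 0.35, 0.60, 0.65, 0.95]   # a consumption preference
r = pearson(trait, pref)                  # strongly positive here
```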

Fig. 2. Correlations between the percentiles of the personality characteristics and the
consumption preferences, the cross is used when the statistical significance is <0.05

4 Discussion

The IBM Watson™ Personality Insights service computes personality characteristics -
five dimensions and 30 facets of the Big Five, 12 needs and five values. We
explained the GloVe algorithm used in the IBM Watson™ Personality Insights service.
However, we did not know exactly how the values of the traits are calculated. Nevertheless,
the items of the International Personality Item Pool (IPIP) have been used [18].
There are many items (i.e., phrases or statements) used to describe people’s behaviors.
Answers (e.g., Very Inaccurate, Moderately Inaccurate, Neither Accurate Nor Inaccurate,
and Moderately Accurate) to the items (i.e., accuracies of the statements) affect
negatively or positively the results of the Big Five (i.e., dimensions and facets). For
example, item-total correlations for the full sample have been calculated [29]. In
general, the IPIP items are research-based and relevant when people’s behaviors are
intervened. For example, the anger facet is related to the items “Get angry easily”,
“Get irritated easily”, “Lose my temper”, and “Am not easily annoyed” - the answers to
the items “Get angry easily”, “Get irritated easily”, and “Lose my temper” have positive
effects and the answer to the item “Am not easily annoyed” has a negative effect
on the value of the anger facet [29].
Open and standardized measures in the pipelines from raw data to traceable results
are required (Fig. 3). In general, when we assess the results of the calculations we have
to understand the algorithms of APIs and how we interpret the results of the APIs.

Fig. 3. Example of the pipeline

In many cases, the results of the APIs have to be manipulated for further processing.
Therefore, we have to have analytical, even statistical, tools to get interpretable and
significant insights. Only some of the insights are derived in a straightforward way from the
raw data. More and more, the insights are inferred by machine learning. For example,
fitness trackers (e.g., BiAffect [30]) and other deep learning solutions are based on Keras,
the examples of which give ideas of versatile uses. Keras [31] is the frontend that will
use TensorFlow [32] as its tensor (i.e., an organized multidimensional array of
numerical values [33]) manipulation library.
In the future, we will build proofs of concept or solutions where well-established
corpora and IPIP items are used in word or sentence embedding. There seem to be
semantic roles such as actions (e.g., avoid and believe) and objects (e.g., pressure and
rule) in the IPIP items that can be related to certain personality characteristics and
applied in word or sentence embedding. Furthermore, we will build models
where open source APIs and other building blocks (e.g., TPOT, the automatic machine
learning tool [34]) are adapted within artificially intelligent technologies such as
Keras and TensorFlow.

Appendix A – Example of Big Five Traits and Facets of Them



Appendix B – Percentiles and Raw Scores for Personality Characteristics

Appendix C – Consumption Preferences



References
1. The International Organization for Standardization: ISO 5127:2017(en) Information and
documentation — Foundation and vocabulary (2017). https://www.iso.org/obp/ui#iso:std:
iso:5127:ed-2:v1:en
2. The European Union: The European Parliament or the Council Regulation (EU) 2016/679 of
the European Parliament and of the Council of 27 April 2016 on the protection of natural
persons with regard to the processing of personal data and on the free movement of such
data, and repealing Directive 95/46/EC (General Data Protection Regulation) (2016). https://
eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:32016R0679&from=EN
3. The International Organization for Standardization, the International Electrotechnical
Commission: ISO/IEC 29100:2011(en) Information technology — Security techniques —
Privacy framework (2011). http://standards.iso.org/ittf/PubliclyAvailableStandards/index.
html
4. Bruce, P.: The Real Facebook Controversy. Data Science Central (2018). https://www.
datasciencecentral.com/profiles/blogs/the-real-facebook-controversy
5. Ajzen, I.: Attitudes, Personality and Behavior. Open University Press, Maidenhead (2005)
6. IBM: API mashup guide (2017). https://www-01.ibm.com/common/ssi/cgi-bin/ssialias?
htmlfid=LBS03048USEN
7. Gain, U., Hotti, V.: Tones and traits - experiments of text-based extractions with cognitive
services. Finnish J. eHealth eWelfare 9(2–3), 82–94 (2017). https://doi.org/10.23996/fjhw.
61001
8. IBM: Getting started tutorial (2018). https://console.bluemix.net/docs/services/personality-
insights/getting-started.html#gettingStarted
9. Davis, T.: Sparse matrix algorithms and software (2018). http://faculty.cse.tamu.edu/davis/
research.html
10. Ingersoll, K.: An Intro to Natural Language Processing in Python: Framing Text
Classification in Familiar Terms (2018). https://www.datasciencecentral.com/profiles/
blogs/an-intro-to-natural-language-processing-in-python-framing-text
11. Wikipedia: List of text corpora (2018). https://en.wikipedia.org/wiki/List_of_text_corpora
12. Turku NLP Group: Models (2018). http://bionlp-www.utu.fi/wv_demo/
13. Zeng, W., Luo, W., Fidler, S., Urtasun, R.: Efficient Summarization with Read-Again and
Copy Mechanism (2017). https://arxiv.org/abs/1611.03382
14. Google: Neural Machine Translation (seq2seq) Tutorial (2018). https://www.tensorflow.org/
tutorials/seq2seq
15. Pagliardini, M., Gupta, P., Jaggi, M.: Unsupervised Learning of Sentence Embeddings using
Compositional n-Gram Features (2017). https://arxiv.org/pdf/1703.02507.pdf
16. Pennington, J., Socher, R., Manning C.D.: GloVe: Global Vectors for Word Representation
(2014). https://www.aclweb.org/anthology/D14-1162
17. Google: word2vec (2013). https://code.google.com/archive/p/word2vec/
18. IBM: The science behind the service (2018). https://console.bluemix.net/docs/services/
personality-insights/science.html#science
19. Pennington, J., Socher, R., Manning, C.D.: GloVe: Global Vectors for Word Representation
(2014). https://nlp.stanford.edu/projects/glove/
20. Futusome: Services (2017). https://www.futusome.com/en/services/
21. GitHub: watson-developer-cloud/python-sdk (2018). https://github.com/watson-developer-
cloud/python-sdk/tree/develop/examples
22. IBM: Personality Insights – API reference (2017). https://www.ibm.com/watson/
developercloud/personality-insights/api/v3/?python

23. IBM: Personality models (2018). https://console.bluemix.net/docs/services/personality-


insights/models.html#models
24. IBM: Needs (2017c). https://console.bluemix.net/docs/services/personality-insights/needs.
html#needs
25. IBM: Values (2017d). https://console.bluemix.net/docs/services/personality-insights/values.
html#values
26. IBM: Interpreting the numeric results (2018). https://console.bluemix.net/docs/services/
personality-insights/numeric.html#numeric
27. IBM: Consumption preferences (2018). https://console.bluemix.net/docs/services/
personality-insights/preferences.html#preferences
28. The Comprehensive R Archive Network: Hmisc: Harrell Miscellaneous (2018). https://
CRAN.R-project.org/package=Hmisc
29. Johnson, J.: Measuring thirty facets of the five factor model with a 120-item public domain
inventory: development of the IPIP-NEO-120. J. Res. Pers. 51, 78–89 (2014). https://doi.
org/10.1016/j.jrp.2014.05.003
30. BiAffect: BiAffect (2017). http://www.biaffect.com/
31. Keras: The Python Deep Learning library (2018). https://keras.io/
32. TensorFlow: An open source machine learning framework for everyone (2019). https://
www.tensorflow.org/
33. Wikipedia: Tensor (2018). https://en.wikipedia.org/wiki/Tensor
34. GitHub: EpistasisLab/tpot (2018). https://github.com/EpistasisLab/tpot
Paper VII
Authors: Gain U, Hotti V.

Year: 2020.

Article title: Awareness of automation, data origins, and processing stakeholders by parsing the General Data
Protection Regulation sanction-based articles.

Journal: Electronic government. DOI: 10.1504/EG.2021.10034597

Publisher: Inderscience publishers

Permissions from co-authors via email: Hotti Virpi received 9.11.2021 at 7:57

Inderscience retains the copyright of the article

Reproduced with permission in this thesis in form of Accepted Manuscript by Jeanette Brooks Publications Director,
Inderscience Enterprises Limited, SWITZERLAND via email 13.11.2021 at 14:32
Inderscience retains the copyright of the article, Electronic government, DOI 10.1504/EG.2021.10034597
Awareness of automation, data origins, and processing stakeholders by parsing the General Data
Protection Regulation sanction-based articles
Author(s): Ulla Gain*, Virpi Hotti
* University of Eastern Finland, School of Computing
Email address: [[email protected], [email protected]]

Abstract: Datafication forces us to be aware of data origin and of data-based insights. Further, weak awareness of
automated data processing and of processing stakeholders challenges compliant data usage. The research concerns the General Data
Protection Regulation (GDPR) articles that define sanction criteria, to figure out whether awareness of automation, data categories, and
processing stakeholders is increased and whether cognitive services are useful for manifesting indicative semantic roles. All 49 articles
(5-9, 11-22, 25-39, 41-49, 58, 85-91) concern processing stakeholders, eight articles (9, 13, 14, 15, 20, 21, 35 and 37) concern data
categories, and seven articles (13, 14, 15, 20, 22, 35, and 47) concern automation. Krippendorff’s Alpha was used to estimate the
intercoder reliability between two coders (the cognitive service and a human interpreter), and the value 0.85 indicates that the parsing
capability of the IBM Watson Natural Language Understanding Text Analysis cognitive service amplifies human interpretations and
interventions.

Keywords: automation, data governance, data category, GDPR, natural language processing, content analysis, semantic roles

1 Introduction
Datafication forces us to be aware of data origin and of data-based insights. Datafication means that our actions are
transformed into quantified data, which allows “real-time tracking and predictive analysis” (van Dijck 2014) as well as manifesting
psychographic data, for example, personality and values. Data are defined, for example, as facts about an object (a.k.a. entity or item), the
definition of which is anything perceivable or conceivable (ISO 9000:2015). Data origin refers to the origin-based data taxonomy
(Abrams 2014), the categories of which are observed, provided, derived, and inferred, based on individual awareness concerning
data origin – observed and provided data are inputs for derived and inferred data. Data can be transformed into meaningful information
through cognitive processing (Baškarada et al., 2013). Insights refer to meaningful information that is defined, for example, as meaningful data
(ISO 9000:2015) or as a combination of data (i.e., a collection of values of measured or derived characteristics of objects) concerning
objects (ISO 20140-5:2017). Nowadays, an insight has one standardized definition (ISO 56000:2020) as a “profound and unique
outcome of the assimilation of information through learning about anything perceivable or conceivable an entity.” If there are
real-time requirements or even preventive ones, then actionable insights have to be obtained, for example, by adopting augmented or automated
machine learning and other sophisticated techniques that can be used to amplify cognitive capabilities.
Datafication’s consequences for social freedom require standardized discourses about data (Couldry and Yu 2018). Authority
documents such as the General Data Protection Regulation (GDPR) affect the discourses. However, there is a huge number of terms
with and without explicit definitions, even with several synonyms. For example, the Unified Compliance Framework (UCF) has associated
13 synonyms of data (i.e., archive, background, documentation, dossier, file, input, intelligence, material, proof, record, return,
statement, and statistics) from hundreds of authority documents (Unified Compliance 2020). Compliance or conformance is defined, for
example, as “fulfilment of specified requirements” (ISO 2394:2015), for example, fulfilling the requirements of the General Data Protection
Regulation (GDPR) when processing the data of natural persons. Enforcement of compliance with sanctions forces organizations to
control and direct both commissioned data processing and their processes. Therefore, data valuation is facilitated by types of data or
data categories (Mussalo et al. 2018a). Moreover, there are statements and data use categories such as advertise, improve, personalize,
provide, upgrade, and share (ISO/IEC 19944:2017), the definitions of which can be used to illustrate compliant data usage (Figure 1).
Inderscience retains the copyright of the article, Electronic government, DOI 10.1504/EG.2021.10034597

Figure 1. Observed or provided data are either information or insights used, for example, for users, customers, target persons, or individuals (ISO/IEC
19944:2017; Abrams 2014)

Compliance depends on effective data governance (Gregory and Hunter, 2011). There are several frameworks (e.g., DGI 2015,
Dahlberg and Nokkala 2015, Stanford University 2011) for data governance and for the maturity assessment of data governance.
Further, there are many activities (Alhassan et al. 2016) that are included in data governance. However, awareness of data-related
motivation elements, such as policies and requirements, forms the foundation of both governance and further compliant data usage.
Moreover, the causes and actions that can be established during the critical success factors (CSFs) increase common understanding. The
associated actions concretize the meanings of the CSFs (e.g., “employee data competencies,” “clear data processes and procedures,”
“flexible data tools and technologies,” “standardized easy-to-follow data policies,” “established data roles and responsibilities,” “clear
inclusive data requirements,” and “focused and tangible data strategies”) for data governance (Alhassan et al. 2018); for example, the
“employee data competencies” CSF can be established to “increase employee awareness and training,” and the “clear inclusive data
requirements” CSF can be established to “have the right data requirements and comply with regulations.” Data governance and compliant
data usage are many stakeholders’ concerns. The roles of data governance differ depending on the frameworks in which the
responsibilities and accountability of the stakeholders are addressed with respect to the data itself (Walti et al. 2015). Data stakeholders draw up the
codes of conduct intended to contribute to the proper application of the authority documents because the codes of conduct are a challenge for
the organizations, and the conduct risks will be the most meaningful (Mussalo et al. 2018b). One reason might be that authority
documents define their terms and mandate implementation controls called common controls; for example, the GDPR is mapped
into 1497 common controls (Mussalo et al. 2018b). Moreover, most data regulations come from external regulators, for example, due
to cloud computing, and several parties are required to document their measures to guarantee cloud data governance (Al-Ruithe et al.
2019).
When “regulations and data governance remove uncertainty about what can be shared, how, and by whom” (Ransbotham and Kiron
2017), then there should be awareness of data processing possibilities (e.g., data categories and automation) and processing stakeholders.
Awareness of data categories means knowledge of the data and even of the pipelines associated with downstream, stewardship, and upstream
processes. Within provided data (e.g., filled applications), individuals are highly aware; within observed data (e.g., recorded clicks and
phones), individuals might be partly aware; within derived data (e.g., time-on-page), individuals are not aware of how observed and
provided data are manipulated; within inferred data (e.g., recommendations), individuals are not aware because of analytical
evaluations. Awareness of processing stakeholders means knowledge of the roles and of the mandates of the roles to process data. However,
deploying automation means that something is done without direct manual interventions, or even without human interventions.
Therefore, awareness of automation is a crucial part of compliant data usage.
The General Data Protection Regulation (GDPR) might be the best-known authority document that has affected both the discourses
about data and data governance. Therefore, the research is based on a procedure where the content of the GDPR articles is coded,
and cognitive services are used to parse sentences into the subject-action-object form (Section 2). We extract semantic roles and terms
from the GDPR articles to figure out whether awareness of data categories, automation, and processing stakeholders is increased and
whether the IBM Watson Natural Language Understanding Text Analysis cognitive service is useful to manifest indicative semantic
roles (Section 4).

2 Material and Methods


The main purpose of the GDPR is to mandate authorities to protect data subjects concerning their personal data processing. The
main stakeholders that are accountable for personal data processing are certification bodies, controllers, monitoring bodies, and
processors: a controller is “the natural or legal person, public authority, agency or other body which, alone or jointly with others,
determines the purposes and means of the processing of personal data” (GDPR, Article 4, Paragraph 7); a processor is “a natural or legal
person, public authority, agency or other body which processes personal data on behalf of the controller” (GDPR, Article 4, Paragraph
8); a monitoring body “has an appropriate level of expertise in relation to the subject-matter of the code and is accredited for that purpose
by the competent supervisory authority” (GDPR, Article 41); and a certification body has “an appropriate level of expertise in relation
to data protection shall, after informing the supervisory authority in order to allow it to exercise its powers pursuant to point (h) of
Article 58(2) where necessary, issue and renew certification” (GDPR, Article 43). If certification bodies, controllers, monitoring bodies,
or processors infringe the provisions of the GDPR, then administrative fines of up to 10 MEUR or 20 MEUR apply, or in the “case of an
undertaking, up to 2% or 4 % of the total worldwide annual turnover of the preceding financial year, whichever is higher” (Article 83).
Article 83 refers to 49 GDPR articles (5-9, 11-22, 25-39, 41-49, 58, 85-91) that define sanction criteria. Therefore, those 49 articles are
selected as research material because they contain mostly inter references (Table 1). However, the content of the 49 sanction-based General
Data Protection Regulation (GDPR) articles is researched, and the rest of the GDPR, which comprises 99 articles and 173 recitals, is
used to clarify the results.

Table 1. GDPR articles and fines (A = max 10 MEUR or 2 % of turnover, B = max 20 MEUR or 4 % of turnover). The inter references are among the 49
articles themselves, and the outer references also contain some other mentioned GDPR articles.
No Name Fine A Fine B Inter ref. Outer ref.
5 Principles relating to processing of personal data x x
6 Lawfulness of processing x x 10, 23(1)
7 Conditions for consent x x
8 Conditions applicable to child's consent in relation to information society services x x
9 Processing of special categories of personal data x x
11 Processing which does not require identification x x
12 Transparent information, communication and modalities for the exercise the rights of the data subject x x 92
13 Information to be provided where personal data are collected from the data subject x x
14 Information to be provided where personal data have not been obtained from the data subject x x
15 Right of access by the data subject x x
16 Right to rectification x
17 Right to erasure ('right to be forgotten') x x
18 Right to restriction of processing x x
19 Notification obligation regarding rectification or erasure of personal data or restriction of processing x x
20 Right to data portability x x
21 Right to object x x Directive 2002/58/EC
22 Automated individual decision-making, including profiling x x
25 Data protection by design and by default x x
26 Joint controllers x x
27 Representatives of controllers or processors not established in the Union x x 10
28 Processor x x 23, 63, 82-84, 93(2)
29 Processing under the authority of the controller or processor x
30 Records of processing activities x x 10
31 Cooperation with the supervisory authority x
32 Security of processing x x 40
33 Notification of a personal data breach to the supervisory authority x 55
34 Communication of a personal data breach to the data subject x x
35 Data protection impact assessment x x 10, 40, 63, 68,
36 Prior consultation x x
37 Designation of the data protection officer x x 10
38 Position of the data protection officer x x
39 Tasks of the data protection officer x x
41 Monitoring of approved codes of conduct x 40, 57, 63
42 Certification x x 3, 55, 56, 63
43 Certification bodies x x 55-57, 77-84, 92, 93(2); (EC) No 765/2008; EN-ISO/IEC 17065/2012
44 General principle for transfers x x 50
45 Transfers on the basis of an adequacy decision x x 93(2;3)
46 Transfers subject to appropriate safeguards x x 40, 63, 93(2)
47 Binding corporate rules x x 63, 79, 93(2)
48 Transfers or disclosures not authorised by Union law x x 50
49 Derogations for specific situations x x
58 Powers x x 40(5), 60-76, 83
85 Processing and freedom of expression and information x x 10, 23, 24, 40,50, 51-57, 59, 60-76
86 Processing and public access to official documents x
87 Processing of the national identification number x
88 Processing in the context of employment x
89 Safeguards and derogations relating to processing for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes x x
90 Obligations of secrecy x x
91 Existing data protection rules of churches and religious associations x x 51-57, 59

We research whether the GDPR statements increase awareness of automated data processing, data categories, and processing
stakeholders based on the content of the 49 GDPR articles. The term ‘auto’ is searched for with the browser, and the articles that contain the term
‘auto’ are listed. Further, the term ‘processing’ is searched for because it is defined to contain automation (GDPR Article 4 Paragraph
2). The terms ‘data’ and ‘information’ are also searched for. Moreover, there are verbs such as collect, evaluate, monitor, obtain, profile, and
reveal that refer to derived, inferred, observed, or provided data. Awareness concerning automated data processing can be evaluated using
the coded statements and the related articles. The following definitions of the top-level data categories are based on the reviewed articles (Mussalo et
al. 2018a): observed – recorded automatically, provided – given by individuals, derived – aggregated from other data, inferred –
generated from correlated data or datasets. Further, data categories aligned with data origins are provided by adapting the following rules:
1. IF there are statements the contents of which refer ‘collected’ or ‘obtained’ from the data subject THEN the category is provided
2. IF there is the term ‘monitoring’ THEN the category is observed
3. IF the data subject has given a consent to manifest her data THEN the category is either observed or provided
4. IF there is a term ‘profiling,’ ‘revealing,’ or ‘evaluation’ THEN the category is either derived or inferred.
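The four rules above can be sketched as a simple keyword classifier. The function name and the example statements below are illustrative assumptions for demonstration; the actual coding was performed manually against the article texts.

```python
def classify_data_category(statement: str) -> set:
    """Map a GDPR statement to origin-based data categories (rules 1-4).

    A minimal keyword sketch of the four rules above, not the full
    manual coding procedure.
    """
    text = statement.lower()
    categories = set()
    # Rule 1: 'collected'/'obtained' from the data subject -> provided
    if ("collected" in text or "obtained" in text) and "data subject" in text:
        categories.add("provided")
    # Rule 2: 'monitoring' -> observed
    if "monitoring" in text:
        categories.add("observed")
    # Rule 3: consent to manifest the data subject's data -> observed or provided
    if "consent" in text:
        categories.update({"observed", "provided"})
    # Rule 4: 'profiling', 'revealing', or 'evaluation' -> derived or inferred
    if any(term in text for term in ("profiling", "revealing", "evaluation")):
        categories.update({"derived", "inferred"})
    return categories
```

The function returns a set because a single statement may indicate several categories, mirroring the "either … or" outcomes of rules 3 and 4.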
When we research whether the GDPR statements increase awareness of processing stakeholders based on the content of the 49 GDPR
articles, the cognitive service and a content analysis tool are used as the second coder. The cognitive service is used to parse
sentences into semantic roles in the subject-action-object form, where subjects and objects refer to potential processing stakeholders.
The versions of the cognitive service are the IBM Watson™ Natural Language Understanding service (IBM 2018) and the IBM Watson
Natural Language Understanding Text Analysis service (IBM 2020). The main difference between the versions is the representation of the
semantic roles – the IBM Watson™ Natural Language Understanding service underlines and annotates the semantic roles, whereas the
IBM Watson Natural Language Understanding Text Analysis service offers the tabulated subject-action-object form. The IBM Watson
Natural Language Understanding Text Analysis service (abbreviated hereafter the NLU service) is used to generate the subject-action-
object forms of the selected 49 GDPR articles. The number of the parsed paragraphs and the number of their subject-action-object forms,
as well as the numbers of the distinct subjects, actions, and objects, are calculated based on the tabulated subject-action-object forms
(Table 2). The Atlas.ti content analysis tool is used to code stakeholder instances from the Excel file, the columns of which are article,
paragraph, subject, action, and object. Atlas.ti (2020) codes the columns automatically by the names of the columns, which helps to
validate potential stakeholders before they are autocoded, as well as to explore co-occurrences.
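The descriptive scores can be reproduced from the tabulated forms with a short script. The tuples below are a hypothetical excerpt in the spirit of Table 2, not the full NLU export.

```python
from collections import Counter

def describe_sao(rows):
    """Count subject-action-object forms and the distinct subjects,
    actions, and objects from tabulated (subject, action, object) rows."""
    subjects, actions, objects = (Counter(col) for col in zip(*rows))
    return {
        "forms": len(rows),
        "distinct_subjects": len(subjects),
        "distinct_actions": len(actions),
        "distinct_objects": len(objects),
    }

# Hypothetical excerpt of the tabulated forms (cf. Table 2, Article 7)
rows = [
    ("processing", "is based on", "consent"),
    ("processing", "is based on", "consent"),
    ("the controller", "to demonstrate", "that the data subject has consented"),
]
```

The same counts can be obtained in the spreadsheet itself, but the script makes the definition of "distinct" explicit: exact string equality per column.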

Table 2. Examples of subject-action-object forms (Article 7) and stakeholder instances (italicised) in the subjects or objects
Subject Action Object
processing is based on consent
processing is based on consent
the controller to demonstrate that the data subject has consented to processing of his or her personal data
the controller processing of his or her personal data
the data subject's consent is given in the context of a written declaration which also concerns other matters
a written declaration concerns other matters
consent be presented
The data subject shall have the right to withdraw his or her consent at any time
The data subject to withdraw his or her consent
The withdrawal of consent affect the lawfulness of processing based on consent before its withdrawal
the data subject be informed thereof
the data subject shall be informed thereof
It to give consent
utmost account shall be taken
the provision of a service including inter alia, the performance of a contract
the processing of personal data that is not necessary for the performance of that contract

The most prominent processing stakeholders (Table 3) are obtained by WordArt, which has been used to obtain the word frequencies
of the GDPR (Ataei et al. 2018). After that, the instances of the potential processing stakeholders are either generalized or specialized
(e.g., from ‘supervisor’ to ‘European Data Protection Supervisor’) based on the instances and word frequencies of the GDPR. Potential
processing stakeholder categories and their mapping stamps in parentheses are authorities (A), data subjects (DS), and data protection
officers (DPO) as well as certification bodies (CB), controllers (C), monitoring bodies (MB), and processors (P). The potential
processing stakeholders are potential infringers of the provisions of the GDPR based on Article 83.
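The frequency step does not depend on WordArt as such; a plain token counter gives comparable counts. The sample text and term list below are illustrative, not drawn from the GDPR.

```python
import re
from collections import Counter

def word_frequencies(text, terms):
    """Count case-insensitive whole-word occurrences of stakeholder terms."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    return {term: counts[term.lower()] for term in terms}

# Illustrative sample in the register of the regulation
sample = ("The controller shall inform the processor. "
          "The controller documents the processing.")
```

Note that tokenizing whole words keeps ‘processor’ and ‘processing’ apart, whereas a substring search would conflate them; counting the inflected variants of Table 3 would require adding each surface form to the term list.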
Table 3. Potential processing stakeholders and their categories as well as preliminary mapping categories for subjects and objects manifested by NLU
service (C=controller, P=Processor, DPO=Data Protection Officer, MB=Monitoring Body, A=Authority, CB=Certification Body, DS=Data Subject)
Words by WordArt Stakeholder instances in the GDPR Potential processing stakeholder categories Preliminary mapping categories
Agencies 4 Agenc(y)/(ies) 8 A, C, P A
Associations 14 Association(s) 22 A, C, P A
Auditor(s) 1 A, C A
Authority 379 Authorit(y)/(ies)/(y's) 563 A A
Board 126 Board('s) 129 A A
Body 58, Bodies 53 Bod(y)/(ies) 114 A, CB, MB A, CB (if certification body)
Chair 19 Chair(s) 22 A A
Child 16, Children 8 Child(ren)/('s) 25 DS DS
Commission 112 Commission 113 A A
Committee 9 Committee 9 A A
Controller 432 Controller(s)/('s) 505 C C
Council 55 Council 55 A A
Country 82 Countr(y)/(ies) 119 A A
Court 37 Court(s) 59 A A
Subjects 121 Data subject(s)/('s)/(s') 406 DS DS
Employees’ 4 Employee(s)/(s') 10 A, C, P, DS DS
Enterprises 21 Enterprise(s) 23 C, P C, P
Supervisor 12 European Data Protection Supervisor 12 A A
Parliament 59 European/national parliament 60 A A
Holder(s) 6 DS DS
Institutions 11 Institution(s) 11 A, C, P A
State 235 Member State 309 A A
Officer 32 Data protection officer 33 DPO DPO
Organisation 61 Organisation(s) / (s') / ('s) 99 C, P C, P
Party 11 Part(y)/(ies) 18 A, C, P, DS C, P
Persons 109 Person(s)/('s) 186 A, C, P, DS DS
Processor’s 6 Processor(s)/('s) 264 P P
Providers 4 Provider(s) 4 C, P C, P
Recipient 10 Recipient(s) 26 C, P C, P
Representatives 6 Representative(s) 27 A, C, P C, P
Secretariat 7 Secretariat 7 A A
Staff 18 Staff 18 A, C, P C, P
Stakeholder(s) 2 A, C, P A
Undertaking 16 Undertaking(s) 44 A, C, P C, P
Union 220 Union 221 A A

At the same time as subjects or objects are manifested by the NLU service, which manifests indicative semantic roles, the 49 articles
are evaluated (Appendix A) by the human interpreter using the following rules:
1. IF a subject or object is one of the article alternative points THEN the object or subject is (e)
2. IF someone (e.g., an authority, a data subject) did not request something THEN the object or/and the subject is (e)
3. IF something is impossible THEN the object or/and subject is (e)
4. IF there is an auxiliary verb (e.g., may) THEN the object or/and the subject is (e)
5. IF there is an IF statement THEN the object or/and the subject is (e)
6. IF there is a conditional statement (e.g. 'where appropriate') THEN the object or/and the subject is (e)
7. IF there is a conditional ‘shall not apply’ AND a subject or object WITH a defined variable AND ‘unless’ THEN the object or/and
the subject is (e)
8. IF there is a conditional statement (e.g. 'in such cases,' 'in cases of ') THEN the object or/and the subject is (i)
9. IF there is an IF statement AND a role (e.g., a controller, a processor) IS NOT an object or/and a subject THEN the role is (i)
10. IF a subject AND an object (e.g., rules such as collective agreements or law) is not a role AND an action an auxiliary verb (e.g.,
may) effect on a stakeholder role (e.g., employee) THEN the role is (i)
11. IF a subject is an intermediator (e.g., a certification body) to someone (e.g., a controller, a processor) that is a subject THEN an
object (e.g., a data subject) is ‘i'
12. IF a subject (e.g., safeguards) is not a role AND it shall do something THEN a possible role related to subjects are ‘i'
13. IF an object/subject is replaced with an indefinite role (e.g., staff) THEN the object or/and subject is ‘i'
14. IF not rules 1-13 THEN the object or/and subject is ‘e’
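Some of the evaluation rules lend themselves to keyword tests. The sketch below covers only rules 4–6 (auxiliary verbs and conditional statements) and simplifies the manual interpretation; the keyword lists are illustrative assumptions, not exhaustive.

```python
AUXILIARY_VERBS = ("may", "might", "could")          # rule 4 (illustrative list)
CONDITIONALS = ("where appropriate", " if ")         # rules 5-6 (illustrative list)

def code_awareness(sentence: str) -> str:
    """Return '(e)' (conditional explicit) when rules 4-6 fire, else 'e'.

    Padding with spaces makes the whole-word checks catch
    sentence-initial and sentence-final matches.
    """
    text = " " + sentence.lower() + " "
    if any(f" {verb} " in text for verb in AUXILIARY_VERBS):
        return "(e)"           # rule 4: auxiliary verb present
    if any(marker in text for marker in CONDITIONALS):
        return "(e)"           # rules 5-6: IF/conditional statement present
    return "e"                 # rule 14: none of the checked rules fired
```

Rules involving cross-references between roles, subjects, and objects (rules 7–13) require the parsed semantic roles as input and are not reducible to keyword tests.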

Finally, the human interpretation results are compared with the semantic roles that are manifested by the NLU service (Table 4).
Intercoder reliability is verified by Krippendorff’s Alpha (Neuendorf, 2002). The target value is assumed to be more than 0.5, where one
(1) indicates perfect intercoder reliability. The mathematical formula 1-(((m*n-1)/(m-1))*(pfu/pmt)) for Krippendorff’s Alpha uses pfu
= product of any frequencies for a given unit that is different, pmt = each product of total marginals, m = number of coders, and n =
number of units coded in common by the coders. The number of coders is two (2) – one is the NLU service, and the other the human
interpreter. The number of units is 49, which is the number of the coded GDPR articles. The Differ column illustrates the articles for which the coding
results differ, and it is used to calculate pfu. The BOTH row illustrates similarly coded stakeholder categories, and it is used to calculate
pmt.
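As a cross-check, Krippendorff’s Alpha for nominal data and two coders can be computed from a coincidence matrix. This is the general textbook formulation and may differ in arithmetic detail from the pfu/pmt computation described above; the coder labels in the usage example are illustrative.

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(coder1, coder2):
    """Krippendorff's Alpha for nominal data and two coders.

    1.0 indicates perfect intercoder reliability; values near 0 indicate
    agreement at chance level.
    """
    # Coincidence matrix: each jointly coded unit contributes both ordered pairs.
    coincidences = Counter()
    for a, b in zip(coder1, coder2):
        coincidences[(a, b)] += 1
        coincidences[(b, a)] += 1
    n = sum(coincidences.values())
    marginals = Counter()
    for (a, _b), count in coincidences.items():
        marginals[a] += count
    # Observed disagreement: off-diagonal coincidences.
    observed = sum(c for (a, b), c in coincidences.items() if a != b) / n
    # Expected disagreement under chance, from the marginal totals.
    expected = sum(marginals[a] * marginals[b]
                   for a, b in permutations(marginals, 2)) / (n * (n - 1))
    return 1.0 - observed / expected
```

For example, identical codings over four units yield 1.0, and two codings that disagree on one unit out of three yield roughly 0.44.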

Table 4. Semantic roles (S=Subject, O=Object) of processing stakeholders (C=controller, P=Processor, DPO=Data Protection Officer, MB=Monitoring
Body, A=Authority, CB=Certification Body, DS=Data Subject) by using the NLU service (8 columns on the left), awareness evaluation (e=explicit,
(e)=conditional explicit, i=implicit, (i)=conditional implicit) of the processing stakeholders by the human interpreter (8 columns on the right), and coding
disagreements for articles (the Differ column). * means that only one of the coders (the NLU service or the human interpreter) has found the subject or object.
Articles C P DPO MB A CB DS Differ Articles C P DPO MB A CB DS
5 5
6 6
7 S S, O 7 e e
8 8
... ...
91 S* O O* 1 91 e
Σ 43 21 10 1 37 3 40 11 Σ 44 23 10 1 40 3 31
OWN* 1 9 OWN* 2 2 3
BOTH 42 21 10 1 37 3 31

4 Results
4.1 Awareness of automation

The GDPR treats the word ‘processing’ as meaning “any operation or set of operations which is performed on personal data or the sets of
personal data, whether or not by automated means” (GDPR, Article 4 Paragraph 2). All 49 articles of the GDPR concern processing, and
seven articles (13, 14, 15, 20, 22, 35, and 47) concern automation or automated data processing. Profiling is defined to mean any form
of automated processing.
There exists conditional awareness of the data subject, which concerns the rights of the data subject or the available information. The rights
of the data subject do not guarantee the reception of information, because being able to use his/her rights depends on whether the data subject
is aware of the processing and has access to the information. The following articles might increase awareness of automation for data
subjects even if they have not provided the personal data:
• Article 13 (Information to be provided where personal data are collected from the data subject), Article 14 (Information to be
provided where personal data have not been collected from the data subject), and Article 15 (Right of access by the data subject)
contain one similar point “the existence of automated decision-making, including profiling, referred to in Article 22(1) and (4) and,
at least in those cases, meaningful information about the logic involved, as well as the significance and the envisaged consequences
of such processing for the data subject” (Article 13 Paragraph 2, Point f; Article 14 Paragraph 2, Point g; Article 15 Paragraph,
Point h)
• Article 20 (Right to data portability) has one point “[t]he data subject shall have the right to receive the personal data concerning
him or her … and have the right to transmit those data to another controller without hindrance from the controller to which the
personal data have been provided, where …(b) the processing is carried out by automated means” (Article 20 Paragraph 1, Point b)
• Article 22 (Automated individual decision-making, including profiling) has one paragraph “the data subject shall have the right not
to be subject to a decision based solely on automated processing, including profiling, which produces legal effects concerning him
or her or similarly significantly affects him or her” (Article 22 Paragraph 1)
• Article 47 (Binding corporate rules) contains one point “the rights of data subjects about processing and the means to exercise those
rights, including the right not to be subject to decisions based solely on automated processing, including profiling by Article 22, the
right to lodge a complaint with the competent supervisory authority and before the competent courts of the Member States by Article
79, and to obtain redress and, where appropriate, compensation for a breach of the binding corporate rules” (Article 47 Paragraph
2, Point e)
Article 22 also increases awareness of automation for the controller because automated individual decision-making (including
profiling) is allowed when it is “necessary for entering into, or performance of, a contract between the data subject and a data controller”;
“is authorised by Union or Member State law to which the controller is subject and which also lays down suitable measures to safeguard
the data subject's rights and freedoms and legitimate interests”; or “is based on the data subject's explicit consent.” However, “personal
data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, and the processing
of genetic data, biometric data to uniquely identify a natural person, data concerning health or data concerning a natural person's sex life
or sexual orientation shall be prohibited” (Article 9 Paragraph 1) to use in the automated individual decision-making process “unless
point (a) or (g) of Article 9(2) applies and suitable measures to safeguard the data subject's rights and freedoms and legitimate interests
are in place” (Article 22 Paragraph 4). The Data Protection Impact Assessment (DPIA) is requested from the controllers prior to the
processing “where a type of processing, in particular, using new technologies, and taking into account the nature, the scope, context,
and purposes of the processing, is likely to result in a high risk to the rights and freedoms of natural persons” (Article 35 Paragraph 1).
The controller is also requested to seek the advice of the data protection officer (where designated) about carrying out the DPIA. Awareness
of automation for the controller is based on Article 35 (Data protection impact assessment), which contains one point “a systematic and
extensive evaluation of personal aspects relating to natural persons which are based on automated processing, including profiling, and
on which decisions are based that produce legal effects concerning the natural person or similarly significantly affect the natural person”
(Article 35 Paragraph 3, Point a), where significantly affecting the natural person is defined as “in particular analysing or predicting
aspects concerning performance at work, economic situation, health, personal preferences or interests, reliability or behaviour, location
or movements, in order to create or use personal profiles” (Recital 71).
The supervisory authority may establish and make public a list of the kinds of processing operations which are subject to the
requirement for a DPIA (Article 35 Paragraph 4). For awareness of processing and awareness of automation, the role of the supervisory
authority is essential, since (s)he is in a position to approve binding corporate rules (Article 58 Paragraph 3 Point j) and to estimate the high
risks of processing, and (s)he has the powers to affect the processing as well as to order, for example, the “controller or the processor to
comply with the requests of the data subject to exercise his or her rights pursuant to this Regulation” (Article 58 Paragraph 2 Point c) and
to inform the data subject about a personal data breach (Article 34 Paragraph 4).
4.2 Awareness of data categories

Of the sanction-based articles, eight take a stand on what the origin of the personal information is. There is one article concerning
all data categories (Article 9), three articles for observed data (Articles 14, 35, and 37), and two articles for provided data (Articles 13 and
20). There are five articles concerning derived or inferred data (9, 14, 15, 21, and 35). Provided data are created by direct actions taken
by the data subject, and observed data are usually recorded without the consciousness of the data subject. However, the data subjects
are usually not aware of how observed and provided data are manipulated (i.e., how derived and inferred data are generated).
Article 9 (Processing of special categories of personal data) has two points indicating observed or provided data: the data subject
is aware based on “has given explicit consent” (Paragraph 2 Point a) or possibly aware based on “manifestly made public by the data
subject” (Paragraph 2 Point e). The term ‘revealing’ indicates the derived or inferred data: “Processing of personal data revealing the
personal data of special categories is prohibited” (Paragraph 1).
Article 13 (Information to be provided where personal data are collected from the data subject) contains three paragraphs with
terms implicitly indicating that data are provided: the awareness of the data subject is based on
“…collected from the data subject… at the time when personal data are obtained” (Paragraph 1), “[i]n addition to the information
referred to in paragraph 1, the controller shall, at the time when personal data are obtained, provide the data subject” (Paragraph 2), and
“...controller intends to further process the personal data for a purpose other than that for which the personal data were collected”
(Paragraph 3).
Article 14 (Information to be provided where personal data have not been collected from the data subject) has two statements that
indicate observed data: “data have not been obtained from the data subject” (Paragraph 1) and “the controller intends to further process
the personal data for a purpose other than that for which the personal data were obtained” (Paragraph 4). The term ‘profiling’ indicates
that the outcome of the process can be derived or inferred data; the data subject is probably not aware but becomes
aware when the controller provides information (Paragraph 2 Point g). Furthermore, Article 14 states that the controller shall provide
information “from which source the personal data originate, and if applicable, whether it came from publicly accessible sources”
(Paragraph 2 Point f).
Article 15 (Right of access by the data subject) has only one statement where the term ‘profiling’ is used to indicate that the origin of
the data is derived or inferred. The data subject shall have access to the personal data; in other words, the data subject will become
aware of this derived or inferred data on request. The controller has to provide the data subject information about “the existence of
automated decision-making, including profiling … and meaningful information about the logic involved, as well as the significance and
the envisaged consequences of such processing” (Paragraph 1 Point h).
Article 20 (Right to data portability) has one statement referring to provided data: “The data subject has the right to receive the
personal data concerning him or her, which he or she has provided to a controller” (Paragraph 1).
Article 21 (Right to object) has two statements (Paragraphs 1 and 2) in which the term ‘profiling’ indicates derived or inferred data.
The data subject has the right to object to the processing of his or her data for direct marketing, scientific or historical research, or statistical purposes
(Paragraphs 4 and 6).
Article 35 (Data protection impact assessment). The term ‘monitoring’ indicates observed data: “a systematic
monitoring of a publicly accessible area on a large scale” (Paragraph 3 Point c). The terms ‘evaluation’ and ’profiling’ refer
to derived or inferred data (Paragraph 3 Point a). The data subject will be aware of the data protection impact assessment if the controller
seeks “the views of data subjects or their representatives on the intended processing” (Paragraph 9).
Article 37 (Designation of a data protection officer is required by the controller and the processor) has one statement containing
the term ‘monitoring’, which indicates observed data: “regular and systematic monitoring of data subjects on a large scale”
(Paragraph 1 Point b). If a data protection officer has been designated, the data subject might be monitored. The data subject has to act
for awareness; (s)he may contact the data protection officer with all issues under the regulation and the exercise of his or her rights (Article
38 Paragraph 4).
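The term-to-category reading above (‘profiling’ indicating derived or inferred data, ‘monitoring’ indicating observed data, ‘collected from the data subject’ or ‘provided to a controller’ indicating provided data) can be sketched as a keyword rule table. This is only an illustrative sketch of the analysis; the keyword lists and the category labels below are assumptions distilled from the preceding paragraphs, not part of the GDPR text or of any cognitive service.

```python
# Illustrative keyword rules distilled from the analysis above; the keyword
# lists are an assumption, not an exhaustive reading of the GDPR.
DATA_CATEGORY_RULES = {
    "derived or inferred": ["profiling", "evaluation", "automated decision-making"],
    "observed": ["monitoring", "have not been obtained from the data subject"],
    "provided": ["provided to a controller", "collected from the data subject"],
}

def indicated_categories(statement: str) -> set:
    """Return the data categories whose indicator terms occur in a statement."""
    text = statement.lower()
    return {category
            for category, terms in DATA_CATEGORY_RULES.items()
            if any(term in text for term in terms)}
```

For example, `indicated_categories("regular and systematic monitoring of data subjects on a large scale")` yields `{"observed"}`.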

4.3 Awareness of processing stakeholders


Inderscience retains the copyright of the article, Electronic government, DOI 10.1504/EG.2021.10034597

Seven paragraphs (Article 25 Paragraph 1, Article 31 Paragraph 1, Article 35 Paragraph 10, Article 38 Paragraph 5, Article
85 Paragraphs 1 and 2, and Article 88 Paragraph 1) were not assigned semantic roles by the NLU service. The descriptive scores, i.e., the number of
parsed paragraphs, the number of their subject-action-object forms, and the numbers of distinct subjects, actions, and objects,
are calculated based on the NLU service output (Table 5). The preliminary mapping categories (Table 3) are used when the processing stakeholder
categories are indicated based on the stakeholders coded by Atlas.ti (Table 6).
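The descriptive scores of Table 5 can be recomputed offline from the JSON that the NLU service returns for its `semantic_roles` feature. The helper below is a minimal sketch assuming the documented response shape (a `semantic_roles` list whose items may carry `subject`, `action`, and `object` entries with a `text` field); counting distinct role texts per paragraph is our assumption about how the scores were derived, not a documented procedure.

```python
def descriptive_scores(nlu_response: dict) -> dict:
    """Count subject-action-object forms and distinct role texts in one
    IBM Watson NLU semantic_roles response (one parsed paragraph)."""
    subjects, actions, objects, forms = set(), set(), set(), 0
    for item in nlu_response.get("semantic_roles", []):
        s = item.get("subject", {}).get("text")
        a = item.get("action", {}).get("text")
        o = item.get("object", {}).get("text")
        if s and a and o:          # a complete subject-action-object form
            forms += 1
        if s: subjects.add(s.lower())
        if a: actions.add(a.lower())
        if o: objects.add(o.lower())
    return {"forms": forms, "subjects": len(subjects),
            "actions": len(actions), "objects": len(objects)}
```

Feeding the per-article responses through this helper would reproduce the per-row counts tabulated in Table 5.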

Table 5. Descriptive scores and indicative semantic roles as well as mapping into processing stakeholder categories based on Table 6
Paragraphs Forms Subjects Actions Objects Indicative semantic roles C P DPO A CB DS
Article S O S O S O S O S O S O
2 12 9 7 11 5 x x
4 29 18 19 21 6 x x x x x x
4 16 11 14 16 7 x x x
3 9 7 8 8 8 x x x x x
4 27 18 17 25 9 x x x x x x
2 13 8 12 11 11 x x x x
8 42 19 34 38 12 x x x x x
4 24 13 17 20 13 x x x x
5 38 21 26 34 14 x x x x x x x x
4 18 14 18 18 15 x x x x x x x
1 4 2 3 4 16 x x
3 21 14 18 20 17 x x x x
3 13 11 13 13 18 x x x x
1 3 2 3 3 19 x x
4 18 11 16 18 20 x x x x x
6 24 13 16 19 21 x x x
4 10 8 9 10 22 x x x x
2 6 5 6 6 25 x x
3 14 10 11 13 26 x x x x x
5 14 10 12 13 27 x x x x x x
10 60 45 50 53 28 x x x x x x x x
1 3 3 3 3 29 x x x
5 15 12 8 12 30 x x x x x x x
4 12 10 11 11 32 x x x x x x
5 18 12 15 17 33 x x x x x x
4 20 12 18 18 34 x x x x
10 33 24 25 30 35 x x x x x x x x x
5 27 21 24 23 36 x x x x x x x x
7 22 12 18 21 37 x x x x x x x x x
5 13 7 12 12 38 x x x x x x
2 11 11 10 11 39 x x x x x x x x
6 20 17 19 20 41 x x x x x x x
8 28 22 17 28 42 x x x x x x x x
9 33 22 31 33 43 x x x x x x x x x
1 10 7 9 10 44 x x x x x x
9 28 17 21 25 45 x x x x x x x
5 16 11 10 13 46 x x x x x
3 32 29 21 30 47 x x x x x x x x
1 5 3 4 4 48 x x x
6 28 20 23 26 49 x x x x x x x x
6 31 24 19 31 58 x x x x x x x x x
1 2 2 2 2 85
1 5 5 5 5 86 x x
1 1 1 1 1 87 x
2 3 3 3 3 88 x x x
4 24 17 14 19 89 x x x
2 8 7 8 8 90 x x x x x x
2 7 7 6 6 91 x x x
11 870 413 350 655 Column total 42 30 25 20 4 3 30 25 3 3 28 33
Table 6. Number of autocoded terms from the tabulated subjects (Ss) and objects (Os) by Atlas.ti as well as stakeholder category (Cat.) mappings and listed
GDPR articles that contain the terms.
Potential stakeholder instances (whole GDPR) Ss Os Cat. S articles (GDPR) O articles (GDPR)
Association(s) 22 3 1 A 37, 91 37
Auditor(s) 1 3 1 A 28 28
6, 17, 20, 28-29, 32, 35-37, 41-43, 46-47, 49, 8-9, 12, 20, 28, 32-34, 36-37, 41-43, 45-47, 49, 58,
Authorit(y)/(ies)/(y's) 563 48 33 A
58, 86, 90-91 86, 90
Board('s) 129 3 A 41, 42, 47
A A: 9, 37, 41, 86, 90 A: 9, 37, 41
x Bod(y)/(ies) 114 25 20
CB CB: 42-43, 58 CB: 42-43, 58
Child(ren)/('s) 25 3 2 DS 8 8
Commission 113 24 A 12, 28, 43, 45-46
5-8, 11-15, 17-22, 25-39, 41-44, 46-49, 58, 6, 9, 11, 13-18, 20, 22, 26-28, 30, 32-33, 35-37, 39,
Controller(s)/('s) 505 181 74 C
90 41-44, 49, 58, 90
x Countr(y)(ies) 119 10 10 A 28, 30, 44-45, 47-49 14-15, 28, 30, 45
Court(s) 59 4 A 37, 47
6-7, 9, 11-18, 20-22, 26, 33-35, 38, 45-47, 49, 5-7, 9, 11-15, 18-19, 21-22, 26-28, 34-37, 41, 43, 47,
Data subject(s)/('s)/(s') 406 62 78 DS
88-89 49, 58, 89
Employee(s)/(s') 10 2 2 DS 39, 47 39
Enterprise(s) 23 7 3 C, P 42, 47, 88 30, 47
European/national parliament 60 1 1 A 36 36
Holder(s) 6 1 1 DS 8 8
6, 8-9, 14, 26, 28-29, 36-37, 43, 47, 49, 58, 6, 8-9, 14, 27-28, 35, 39, 49, 90
Member State 309 32 16 A
87, 89-90
Data protection officer 33 13 9 DPO 37-39 35, 37-38
Organisation(s) / (s') / ('s) 99 12 20 C, P 28, 30, 44-45, 49 14-15, 28, 30, 45
Person(s)/('s) 186 6 8 DS 28, 35, 91 25, 30, 33-35, 44, 91
Processor(s)/('s) 264 80 38 P 27, 28-33, 35-39, 41-44, 46-49, 58, 90 27, 28, 32, 35-39, 41-44, 58, 90
Recipient(s) 26 2 3 C, P 14, 15 14, 49, 58
Representative(s) 27 8 3 C, P 27, 30, 37, 58 27, 35
Staff 18 2 1 C, P 39 37
Undertaking(s) 44 3 2 C, P 37, 47, 88 47
Union 221 16 14 A 6, 9, 14, 26, 28, 29, 37, 47, 49, 58, 89 6, 9, 14, 28, 39, 49, 90

The results of the crosscheck between the semantic rules (Table 7) and the human interpretation are consistent; the exceptions that cause
conditionality and implicit rules are represented in Appendix A, marked with <x>, where x is i, (e), or (i). During the human interpretation,
the following role mappings have been made:
• Data subject – employee (Articles 9 and 88), human (Articles 45 and 88), person(s) (Articles 9, 12, 18, 32, 35, 44, 49 and 91), child
(Articles 6, 8 and 12), and holder (Article 8)
• Controller – enterprise (Articles 30, 42, 47 and 88), person(s) (Article 9, 29, 30), human (Article 22), undertaking (Article 36, 37,
47 and 88), staff (Article 39), churches and religious associations or communities (Article 91)
• Processor – person(s) (Articles 28, 29), employee (Article 39 and 47), undertaking (Article 37), staff (Article 39)
• Data protection officer – person (Article 47), staff (Article 37)
• Body (Article 41) into a monitoring body (based on Article 83), otherwise a body without a prefix is mapped into an authority.
• Authorities within prefixes (e.g., administrative, juridical, official, public, and supervisor) into an authority.
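The role mappings above can be expressed as a small lookup with context-sensitive exceptions for ‘body’ and ‘authority’. The sketch below is one possible encoding of that reading; the term list is an assumption distilled from Table 6 and the bullets above, not an official vocabulary.

```python
# Stakeholder categories: C=controller, P=processor, DPO=data protection
# officer, MB=monitoring body, A=authority, CB=certification body,
# DS=data subject. The term list is an illustrative reading of Table 6.
TERM_TO_CATEGORIES = {
    "association": {"A"}, "auditor": {"A"}, "board": {"A"},
    "child": {"DS"}, "commission": {"A"}, "controller": {"C"},
    "court": {"A"}, "data subject": {"DS"}, "employee": {"DS"},
    "enterprise": {"C", "P"}, "holder": {"DS"}, "member state": {"A"},
    "data protection officer": {"DPO"}, "organisation": {"C", "P"},
    "person": {"DS"}, "processor": {"P"}, "recipient": {"C", "P"},
    "representative": {"C", "P"}, "staff": {"C", "P"},
    "undertaking": {"C", "P"}, "union": {"A"},
}

def stakeholder_categories(term: str) -> set:
    """Map one coded stakeholder term to processing stakeholder categories."""
    t = term.lower().strip()
    if "body" in t:
        # A monitoring body is its own category; a certification body has
        # its own as well; any other body is mapped into an authority.
        if "monitoring" in t:
            return {"MB"}
        if "certification" in t:
            return {"CB"}
        return {"A"}
    if "authority" in t:
        return {"A"}  # prefixes such as 'supervisory' are ignored
    for key, cats in TERM_TO_CATEGORIES.items():
        if key in t:
            return set(cats)
    return set()
```

For example, `stakeholder_categories("supervisory authority")` yields `{"A"}`, while `stakeholder_categories("monitoring body")` yields `{"MB"}`.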
The differences between the coders (Table 8) can be explained from the NLU service point of view, concerning either the parsed results or
the coding of the parsed results, in the following ways:
• Article 14 and Article 15 – Recipient was coded to belong into the categories C (controllers) and P (processors)
• Article 18, Article 22, Article 38, and Article 88 – Authority-related stakeholders were not parsed by the NLU service
• Article 30, Article 33, and Article 47 – Data protection officer was not parsed by the NLU service
• Article 31 and Article 85 – Semantic roles were not parsed by the NLU service
• Article 36 – Data subject was coded from the parsed objects
• Article 41 – Monitoring body was not codable
• Article 42 – Data object was not parsed by the NLU service
Krippendorff’s Alpha is 0,845837367 = 1–(((2*49)/(2-1))*(13/8264)), the calculation of which is based on the mathematical formula
1-(((m*n)/(m-1))*(pfu/pmt)), where m = 2 (i.e., the cognitive service and the human interpreter), n = 49 (i.e., the number of the articles),
pfu = 13 (Table 8, the Differ column), and pmt = 8264 = [(43*23) + (43*5) + (43*0) + (43*35) + (43*3) + (43*38) + (23*5) + (23*0) +
(23*35) + (23*3) + (23*38) + (5*0) + (5*35) + (5*3) + (5*38) + (0*35) + (0*3) + (0*38) + (35*3) + (35*38) + (3*38)] (i.e., all pairings
of the similarly coded stakeholder category counts, Table 8, the BOTH row).
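The alpha value can be recomputed directly from the BOTH row of Table 8. The sketch below follows the numeric substitution shown in the text (2*49 in the numerator) and rebuilds pmt by enumerating all pairings of the similarly coded category counts.

```python
from itertools import combinations

# Values from the BOTH row of Table 8 (articles coded identically for
# categories C, P, DPO, MB, A, CB, DS) and the Differ column total.
both = [43, 23, 5, 0, 35, 3, 38]
m, n, pfu = 2, 49, 13          # coders, articles, coding disagreements

# pmt: sum over all pairings of the similarly coded category counts
pmt = sum(a * b for a, b in combinations(both, 2))

alpha = 1 - ((m * n) / (m - 1)) * (pfu / pmt)
print(pmt, round(alpha, 9))    # 8264 0.845837367
```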

Table 7. Applied rules for awareness of the processing stakeholders (C=controller, P=Processor, DPO=Data Protection Officer, MB=Monitoring Body,
A=Authority, CB=Certification Body, DS=Data Subject) by the human interpreter
Articles C P DPO MB A CB DS
5 14 14
6 14 14 1
7 14 14
8 14 14 14
9 14, 9 14 14, 1
11 3 3
12 14, 3, 4 14 14
13 14 14 14
14 14, 3 14 6 14, 8
15 14 14, 2
16 2 2
17 14 14 2
18 14 14 14
19 3 2
20 14 14 2
21 14 14 14
22 14 14, 5 14, 2
25 14
26 14 14 4
27 14 14 14 1
28 14 14 4, 14 2
29 14 14 14
30 14, 7, 12 14, 7 12 2 12
31 14 14 14
32 14 14 2
33 14 14 12 14
34 14 14 14, 5
35 14 14 14 14 6
36 14 14 14 14
37 14 14 14, 4 14, 4
38 14 14 14 14 14
39 14, 13 14, 13 14 14
41 8 8 14 14
42 14 14 14 14 4
43 14 14 14, 4 14 11
44 14 14
45 14 14
46 14 14 14, 4
47 14 14 14 14, 2 14
48 9 9 14
49 14 14 14 14
58 14 14 14 14 2
85 14
86 14
87 4
88 10 14, 4 10
89 12 12 4
90 14, 4 14, 4 14, 4
91 14
Table 8. Semantic roles (S=Subject, O=Object) of processing stakeholders (C=controller, P=Processor, DPO=Data Protection Officer, MB=Monitoring
Body, A=Authority, CB=Certification Body, DS=Data Subject) by the NLU service (8 columns on the left), awareness (e=explicit, (e)=conditional explicit,
i=implicit, (i)=conditional implicit) of the processing stakeholders by the human interpreter (8 columns on the right), and coding disagreements for articles
(the Differ column)
Art. C P DPO MB A CB DS Differ Art. C P DPO MB A CB DS
5 S O 5 eS eO
6 S, O S, O S, O 6 eS, eO eS, eO (eS), eS, eO
7 S S, O 7 eS eS, eO
8 S S, O S, O 8 eS eS, eO eS, eO
9 S, O S, O S, O 9 eO, (i) 1
eS, eO eS, eO, (eO)
11 S, O S, O 11 eS, eO, (eS) (eO)
12 S S, O S, O 12 eS, (eS), (eS) eS eS, eO,
13 S, O S, O 13 eS, eO
14 S, O S, O* S, O S, O 1 14 eS, eO, (eS) eS, (eO) eO, eS, (iO)
15 S, O S, O* O S, O 1 15 eO eO eS, (eS)
16 O S 16 (eO) (eS)
17 S, O S S 17 eS, eO eS (eS)
18 S, O S, O 1 18 eS, eO eO* eS, eO
19 S O 19 (eS) (eO)
20 S, O S, O S 20 eS, eO eS (eS)
21 S S, O 21 eS eS, eO
22 S, O S, O 1 22 eS, eO (eS), (eO)* e, (eS)
25 S O 25 eS eO
26 S, O S S, O 26 eS, eO eS (eO)
27 S, O S, O O O 27 eS, eO eS, eO eO (eO)
28 S, O S, O S, O S, O 28 eS, eO eS, eO (eS), eO (eO)
29 S S S 29 eS eS eS
30 S, O S, O S, O O 1 30 eS, (eO), iO eS, (eO) iO* (eO) iO
31 1 31 eS* eS* eO*
32 S, O S, O S, O 32 eS. eO eS, eO (eO)
33 S, O S O S, O 1 33 eS, eO eS iO* eO eO
34 S O S, O 34 eS eS eS, (eO)
35 S, O S, O O S, O S, O 35 eS, eO eO eO eS, eO (eO)
36 S, O S, O S S, O O* 36 eS, eO eS, eO eO eS, eO
37 S, O S, O S, O S, O O 37 eS, eO eS, eO (eS), eO, (eO) eS, eO (eS) eO
38 S S, O S, O O 1 38 eS, eO eS, eO eS, eO eO* eS
39 S, O S, O S O S, O 39 eS, iO eS, iO eS eO eS, eO
41 S, O S, O S, O O 1 41 eS, (iS), eO eS, (iS), eO eS* eS, eO eO
42 S, O S, O S, O S, O 1 42 eS, eO eS, eO eS, eO eS (eS)*
43 S, O S, O S, O S, O O 43 eS, eO eS, eO eS, eO, (eS) eS, eO iO
44 S, O S, O S O 44 eS eS eS, eO eO
45 S, O S, O S, O S 45 eS, eO eS, eO eS, eO eS, eO
46 S S S, O S 46 eS eS eS, (eS), eO eS
47 S, O S, O S, O S, O 1 47 eS, eO eS, eO eO* eS, (eO) eS, eO
48 S S S 48 (i) (i) eS
49 S, O S, O S, O S, O 49 eS, eO eS, eO (eS), eO eS, eO
58 S, O S, O S, O S, O O 58 eS, eO eS, eO eS, eO eS, eO (eO), eO
85 1 85 eS, eO*
86 S, O 86 eS, eO
87 S 87 (eS)
88 S S S 1 88 (iS) (iS) (eS), eO* eS, (iO)
89 S S, O 89 i2 i (eS) eS, eO
90 S, O S, O S, O 90 eS, (eO) eS, (eO) (eS), eO
91 S S, O 91 eS eS, eO
Σ 43 25 5 0 35 3 40 13 Σ 44 24 8 1 41 3 40
OWN* 2 1 OWN* 1 1 3 1 6 1
BOTH 43 23 5 0 35 3 38

1 See Rule #9 & Appendix A, Articles 9 and 48
2 See Rule #12 & Appendix A, Article 89
4 Conclusion and Discussion
We categorized 49 GDPR articles to assess data categories, processing stakeholders, and automated data processing in the GDPR
context. All 49 GDPR articles concern processing stakeholders and data processing, eight articles (9, 13, 14, 15, 20, 21, 35 and 37)
concern data categories, and seven articles (13, 14, 15, 20, 22, 35, and 47) concern automation. There seems to be an asymmetry of
power between controllers and processors (Ducato 2016). The GDPR does not add awareness of data; it sets the obligations that need to
be carried out to achieve compliance. These obligations are aimed at protecting personal information; they involve rights and powers that could
be used to enhance the level of awareness and, further, to reach the protection guaranteed by the law. Likewise, the GDPR does not
add awareness of data stakeholders. This can be compared to the individual right to vote as a way to be involved in policies.
Awareness is a prerequisite for exercising rights, especially when the law allows
the use of personal data. When we compare the awareness of the stakeholders with automation or automated and further processing, the
controllers and data subjects are aware of the processing. Still, the awareness of the processor is only highlighted in Article 47 (Binding
corporate rules). However, Article 35 (Data protection impact assessment) makes the processor accountable and responsible for
compliance with the approved codes of conduct. Consent and manifestly published content are significant for the
awareness of the data subjects concerning their essence in processing, as well as in the context of automated and further processing. The
awareness is refined when the data subjects are active in using their rights concerning the available information as a consequence of existing
conditional awareness. The GDPR does not add awareness of data, data categories, or insights. Further, there are examples of data
processing (Figure 2) without their outcomes.

Figure 2. Examples of the GDPR data processing

The scope of the chosen articles followed the sanction base that underpins the articles of this Regulation and constitutes the minimum
compliance with the core of the GDPR. Furthermore, this is the minimum awareness required for compliance
without sanctions. These mapping issues have been brought up to illustrate the relations of the minimum awareness of processing
required to avoid consequences. The significance of automation in relation to awareness means that something can be processed without
awareness; the data categories take a stand on the origin of the personal information. Therefore, they offer essential information
related to the awareness of the data processing and the circumstances of data collection. An essential part of the awareness is the
processing stakeholders, in the sense of who could be aware of the processing required by law. The formal rules have been
constructed, and the results have been presented transparently. All functions have been presented so that the traceability of the results
remains. The created rules have brought formalism to the review and presentation of the articles while maintaining their original form,
context, and contents. In general, the framework could be expanded and tested on other authority documents, for example in e-
government, which requires advanced tools to ensure transparency (Alguliyev et al., 2019). When expanding the framework to
other authority documents, the evaluation and mapping rules need to be supplemented with the terms and statements of the authority
document in question.
The validity of the study is good for the definitions and mappings because the arguments use scientific sources and the GDPR. In
addition, we created evaluation rules; the rules work the same for other researchers, and therefore they add validity and reliability to the
approach and the results. Intercoder reliability has been verified with Krippendorff’s Alpha, the value of which was 0,85, slightly below
the value of one (1) that would indicate perfect reliability. The assessment covered awareness of data categories, automation, and processing stakeholders pursuant to the 49 GDPR
articles. However, different researchers can give different estimates depending on their interpretations of non-explained terms. Reliability
is good when looking at the choice of the material. Moreover, the 49 GDPR articles were parsed with the IBM Watson Natural Language
Understanding Text Analysis cognitive service.

References
Abrams M. (2014). ‘The Origins of Personal Data and its Implications for Governance.’ OECD Expert Roundtable Discussion, Available
at SSRN 2510927, https://b1f.827.myftpupload.com/wp-content/uploads/2020/04/Data-Origins-Abrams.pdf

Alguliyev R. M., Niftaliyeva G. Y. (2019). 'Extracting social networks from e-government by sentiment analysis of users’ comments.'
Electronic Government, 15(1):91–106

Ataei M., Degbelo A. Kray C., Santos V. (2018). ‘Complying with privacy legislation: from legal text to implementation of privacy-
aware location-based services.’ ISPRS International Journal of Geo-Information, 7(11):442; doi:10.3390/ijgi7110442

Alhassan I., Sammon D., Daly M. (2016). ‘Data governance activities: an analysis of the literature.’ Journal of Decision Systems, 25:64–
75

Alhassan I., Sammon D., Daly M. (2018). ‘Data governance activities: A comparison between scientific and practice-oriented literature.’
Journal of Enterprise Information Management.

Al-Ruithe, M., Benkhelifa, E., & Hameed, K. (2019). ‘A systematic literature review of data governance and cloud data governance.’
Personal and Ubiquitous Computing, 23(5-6):839–859.

Atlas.ti. (2020). ‘ATLAS.ti 8 Windows User Manual.’ http://downloads.atlasti.com/docs/manual/atlasti_v8_manual_en.pdf

Baškarada S. and Koronios A. (2013). ‘Data, information, knowledge, wisdom (DIKW): a semiotic theoretical and empirical exploration
of the hierarchy and its quality dimension.’ Australasian Journal of Information Systems, 18(1): 1–24.

Couldry N., Yu J. (2018). ‘Deconstructing datafication’s brave new world.’ New Media & Society, 20(12):4473–4491,
https://doi.org/10.1177/1461444818775968

Dahlberg T., Nokkala, T. (2015). ‘A framework for the corporate governance of data – theoretical background and empirical evidence.’
Business, management and education, 13(1):25–45

DGI (2015). ‘The DGI Data Governance Framework.’ http://www.datagovernance.com/the-dgi-framework/

Ducato R. (2016). ‘Cloud computing for s-health and the data protection challenge: Getting ready for the General Data Protection
Regulation.’ IEEE International Smart Cities Conference (ISC2), Trento, pp. 1–4.

European Commission - GDPR (2016). ‘Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on
the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing
Directive 95/46/EC (General Data Protection Regulation).’ http://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32016R0679

Gregory A., Hunter K. (2011). ‘Data governance — Protecting and unleashing the value of your customer data assets.’ Journal of Direct,
Data and Digital Marketing Practice, 13(1):40–56

IBM (2018). ‘Natural Language Understanding.’ https://natural-language-understanding-demo.ng.bluemix.net

IBM (2020). ‘IBM Watson Natural Language Understanding Text Analysis.’ https://www.ibm.com/demos/live/natural-language-
understanding/self-service/home

ISO/IEC 2382:2015 (2015). ‘Information technology — Vocabulary.’ https://www.iso.org/obp/ui/#iso:std:iso-iec:2382:ed-1:v1:en:en/

ISO 2394:2015 (2015). ‘General principles on reliability for structures.’ https://www.iso.org/obp/ui#iso:std:iso:2394:ed/

ISO/IEC 19944:2017 (2017). ‘Information technology — Cloud computing — Cloud services and devices: Data
flow, data categories and data use.’ https://www.iso.org/obp/ui#iso:std:iso-iec:19944:ed-1:v1:en
ISO 9000:2015. ‘Quality management systems. Fundamentals and vocabulary.’ https://www.iso.org/obp/ui#iso:std:iso:9000:ed-
4:v1:en:term:3.8.1

ISO 20140-5:2017. ‘Automation systems and integration — Evaluating energy efficiency and other factors of manufacturing systems
that influence the environment — Part 5: Environmental performance evaluation data.’ https://www.iso.org/obp/ui#iso:std:iso:20140:-
5:ed-1:v1:en:

ISO 56000:2020. ‘Innovation management — Fundamentals and vocabulary.’ https://www.iso.org/obp/ui#iso:std:iso:56000:ed-1:v1:en:

Mussalo P., Gain U., Hotti V. (2018a). ‘Types of Data Clarify Senses of Data Processing Purpose in Health Care.’ doi:10.3233/978-1-
61499-921-8-55

Mussalo P., Hotti V., Kirjanen A., Lauronen H., Härkönen H., Huikari J., Holopainen J. (2018b). ‘Common controls driven conceptual
leadership framework.’ doi: https://doi.org/10.23996/fjhw.68821

Neuendorf K.A. (2002). ‘The Content Analysis Guidebook.’ SAGE Publications, pp. 156

Ransbotham S., Kiron D. (2017). ‘Analytics as a Source of Business Innovation.’ MIT Sloan Management Review

Stanford University (2011). ‘Data Governance Maturity Model.’ http://web.stanford.edu/dept/pres-provost/cgi-bin/dg/wordpress/wp-content/uploads/2011/11/StanfordDataGovernanceMaturityModel.pdf

Unified Compliance. ‘Compliance Dictionary.’ https://compliancedictionary.com/

Van Dijck J. (2014). ‘Datafication, dataism and dataveillance: Big data between scientific paradigm and ideology.’ Surveillance and
Society, 12(2):197–208

Walti B., Reschenhofer T., Matthes F. (2015). ‘Data Governance on EA Information Assets: Logical Reasoning for Derived Data.’
International Conference on Advanced Information Systems Engineering CAiSE 2015: Advanced Information Systems Engineering
Workshops, pp. 401–412

WordArt. ‘Word Cloud Art Creator.’ https://wordart.com/

Appendix A

We analysed the awareness of processing stakeholders in the sanction-based articles, and our results are based on these parts of the articles.
The content analyses of the GDPR articles are based on potential roles. We bolded subjects, or parts of them, if they contain some
potential roles. We underlined actions, or parts of them, if they concern some named stakeholders. We bolded and underlined objects,
or parts of them, if they contain some named stakeholders. Due to the limited amount of text available, only the Points of the
Articles that require human interpretation (i.e., <i>, <(e)>, <(i)>) are represented here; all results can be replicated with the cognitive service.
Footnotes or vertical lines || clarify the analysed blocks. The content-analysed articles, or parts of them, are the following:
• Lawfulness of processing (Article 6). “Processing shall be lawful only if and to the extent that at least one of the following applies:
(Paragraph 1 Points a to f) the data subject [least one of the following applies, i.e., if it does not apply <(e)>] has given consent to
the processing of his or her personal data for one or more specific purposes” (Paragraph 1, Point a).
• Processing of special categories of personal data (Article 9). “Processing of personal data revealing racial or ethnic origin,
political opinions, religious or philosophical beliefs, or trade union membership, and the processing of genetic data,
biometric data for the purpose of uniquely identifying a natural person [data subject], data concerning health or data
concerning a natural person's sex life or sexual orientation shall be prohibited” (Paragraph 1). There are 10 points when one
of them applies, then Paragraph 1 shall not apply, for example,
o “the data subject [when another of these points are applied] <(e)> has given explicit consent to the processing…” (Paragraph
2, Point a),
o “processing is necessary for the purposes of carrying out the obligations and exercising specific rights of the controller [when
another of these points are applied] <(i)> or of the data subject in the field of employment …” (Paragraph 2 Point b) or,
o “processing relates to personal data which are manifestly made public by the data subject” (Paragraph 2, Point e).
• Processing which does not require identification (Article 11). “If the purposes for which a controller processes personal data do
not or do no longer require the identification of a data subject by the controller, the controller shall not be obliged to maintain,
acquire or process additional information in order to identify the data subject for the sole purpose of complying with this
Regulation” (Paragraph 1). “(T)he controller is able to demonstrate that it is not in a position to identify the data subject, the
controller<(e)> shall inform the data subject<(e)> accordingly, if possible” (Paragraph 2).
• Transparent information, communication and modalities for the exercise, the rights of the data subject (Article 12). “The controller
shall take appropriate measures to provide any information referred to in Articles 13 and 14 and any communication under
Articles 15 to 22 and 34 relating to processing to the data subject in a concise, transparent, intelligible and easily accessible
form, using clear and plain language, in particular for any information addressed specifically to a child. ... When requested
by the data subject, the information may be provided orally, …” (Paragraph 1). “The controller shall facilitate the exercise of
data subject rights under Articles 15 to 22. In the cases referred to in Article 11(2), the controller shall not refuse to act on the
request of the data subject for exercising his or her rights under Articles 15 to 22, | unless the controller<(e)> demonstrates that
it is not in a position to identify the data subject|” (Paragraph 2). “The controller shall provide information on action taken on
a request under Articles 15 to 22 to the data subject without undue delay …” (Paragraph 3). “If the controller does not take action
on the request of the data subject, the controller shall inform the data subject without delay and at the latest within one month
of receipt of the request of the reasons for not taking action…” (Paragraph 4). “Information provided … shall be provided free of
charge. |Where requests from a data subject<(e)> [if controller refuse] are manifestly unfounded or excessive, in particular
because of their repetitive character, the controller may either|” (Paragraph 5 Points a to b) “charge a reasonable fee…” (Point a) or
“refuse to act on the request” (Point b). “The controller shall bear the burden of demonstrating the manifestly unfounded or
excessive character of the request” (Paragraph 5). “Without prejudice to Article 11, where the controller<(e)> has reasonable
doubts concerning the identity of the natural person making the request referred to in Articles 15 to 21, the controller may
request the provision of additional information necessary to confirm the identity of the data subject” (Paragraph 6). “The
Commission shall be empowered to adopt delegated acts” (Paragraph 8).
• Information to be provided where personal data have not been obtained from the data subject (Article 14). “Where personal data
have not been obtained from the data subject, the controller shall provide the data subject with the following information:”
(Paragraph 1 Points from a to f) “{where applicable, that} the controller intends to transfer personal data to a recipient in a
third country or international organisation and the existence or absence of an adequacy decision by the Commission <(e)>,
or in the case of transfers referred to in Article 46 or 47, or the second subparagraph of Article 49(1), reference to the
appropriate or suitable safeguards and the means to obtain a copy of them or where they have been made available”(Point
f). “In addition to the information referred to in the Paragraph 1, the controller shall provide the data subject with the following
information necessary to ensure fair and transparent processing in respect of the data subject” (Paragraph 2 Points from a to f):
o “the existence of the right to request from the controller access to and rectification {or erasure of personal data or}
restriction of processing concerning the data subject and to object to processing as well as the right to data portability”
(Point c),
o “from which source the personal data originate” (Point f),
o “the existence of automated decision-making, including profiling, referred to in Article 22(1) and (4) and, at least in
those cases, meaningful information about the logic involved, as well as the significance and the envisaged consequences
of such processing for the data subject” (Point g).
“The controller shall provide the information referred to in the Paragraphs 1 and 2”, the Paragraph 3, Points a to c. “Where the
controller intends to further process the personal data for a purpose other than that for which the personal data were obtained, the
controller shall provide the data subject prior to that further processing with information on that other purpose and with any
relevant further information as referred to in Paragraph 2” (Paragraph 4). In contrast, the Paragraph 5 indicates that “Paragraphs 1,
to 4 shall not apply where and insofar as:” Points a to d.
o “The data subject already has the information” (Point a),
o “the provision of such information proves impossible [the controller <(e)> shall provide the data subject information] or would
involve a disproportionate effort, …. In such cases the controller shall take appropriate measures to protect the data subject's
rights [<(i)> the information publicly available] and freedoms and legitimate interests, including making the information
publicly available” (Point b),
o “obtaining or disclosure is expressly laid down by Union or Member State law to which the controller 3 is subject and
which [Union or Member State law] provides appropriate measures to protect the data subject's legitimate interests; or”
(Point c).
o “Where the personal data must remain confidential subject to an obligation of professional secrecy regulated by Union or
Member State law, including a statutory obligation of secrecy” (Point d).
• Right of access by the data subject (Article 15). “The data subject<(e)>[the right to obtain confirmation requires request ] shall
have the right to obtain from the controller confirmation as to whether or not personal data concerning him or her are being
processed, where that is the case, access to the personal data and the following information:”
• Right to rectification (Article 16) The data subject <(e)> [requires request] shall have the right to obtain from the controller<(e)>
without undue delay the rectification of inaccurate personal data concerning him or her. Taking into account the purposes of

3 Union or Member State law to which the controller is subject
the processing, the data subject shall have the right to have incomplete personal data completed, including by means of providing
a supplementary statement”.
• Right to erasure (Article 17). “|The data subject <(e)> [requires request] shall have the right to obtain from the controller the
erasure of personal data concerning him or her without undue delay| and |the controller shall have the obligation to erase
personal data without undue delay where one of the following grounds applies|” (Paragraph 1 Points a to f):
o the personal data have to be erased for compliance with a legal obligation in Union or Member State law to which
the controller is subject (Point e)
• Notification obligation regarding the rectification or erasure of personal data or restriction of processing (Article 19). "The
controller shall communicate any rectification or erasure of personal data or restriction of processing carried out in accordance
with Article 16, Article 17(1) and Article 18 to each recipient to whom the personal data have been disclosed, unless this proves
impossible [the controller <(e)> shall communicate] or involves disproportionate effort. |the controller shall inform the data
subject|<(e)> [requires request] about those recipients if the data subject requests it”.
• Right to data portability (Article 20) states “The data subject <(e)> [requires request] shall have the right to receive the personal
data concerning him or her, which he or she has provided to a controller, in a structured, commonly used and machine-
readable format and have the right to transmit those data to another controller without …, where” “the processing is based
on consent pursuant to point a of Article 6 Paragraph 1 or point a of Article 9 Paragraph 2 or on a contract pursuant to point ‘b’ of
Article 6 Paragraph 1 and the processing is carried out by automated means.” (Paragraph 1). The right referred to in paragraph 1
shall not adversely affect the rights and freedoms of others” (Paragraph 4).
• Automated individual decision-making, including profiling (Article 22) states in Paragraph 1 that “the data subject <(e)> [requires
request] shall have the right not to be subject to a decision based solely on automated processing, including profiling, which
produces legal effects concerning him or her or similarly significantly affects him or her.” However, “Paragraph 1 shall not
apply if the decision (Paragraph 2 points a to c):
o is authorised by Union or Member State law <(e)> to which the controller is subject, and which also lays down suitable
measures to safeguard the data subject's rights and freedoms and legitimate interests (Point b), or
• Joint controllers (Article 26) shall be: “where two or more controllers jointly determine the purposes and means of processing.”
Further, “they [joint controllers] shall in a transparent manner determine their respective responsibilities for compliance with the
obligations under this Regulation, {in the particular}[they shall determine] exercising of the rights of the data subject and their
respective duties to provide the information referred to in articles 13 and 14, by means of an arrangement between them,
unless, and in so far as, |the respective responsibilities of the controllers are determined by Union or Member State law to
which the controllers are subject|. The arrangement may designate a contact point for data subjects<(e)>” (Paragraph 1).
• Representatives of controllers or processors not established in the Union (Article 27) states that “where Article 3 Paragraph 2 applies,
the controller or processor shall designate in writing a representative in the Union” (Paragraph 1). [Article 3 Paragraph 2 is: this
Regulation applies to the processing of personal data of data subjects who are in the Union by a controller or processor not
established in the Union, where the processing activities are related to:
• “the offering of goods or services, …, to such data subjects in the Union (Point a) or
• the monitoring of their behaviour …within the Union (Point b).]
“The obligation laid down in Paragraph 1 of this Article shall not apply to:” (Paragraph 2 Points a to b). “processing which is
occasional, does not include, on a large scale, processing of special categories of data …, and is |[processing] unlikely to result in
a risk to the rights and freedoms of natural persons|[data subject]<(e)>, taking into account the nature, context, scope and
purposes of the processing; or” (Point a).
• Records of processing activities (Article 30). Paragraph 1, Points a to g, [The record shall contain] “the name and contact
details of the controller<i> and, where applicable, the joint controller, the controller's representative and the data protection
officer<i>” (Point a), [The record shall contain] “ description of the categories of data subjects<i> and of the categories of
personal data” (Point c). Paragraph 4 “The controller or the processor and {, where applicable,} the controller's or the
processor's representative, shall make the record available to the supervisory authority <(e)> on request.” Further, “The
obligations referred to in Paragraphs 1 and 2 shall not apply to an enterprise <(e)> or an organisation employing fewer than 250
persons unless the processing it carries out is likely to result in a risk to the rights and freedoms of data subjects, the processing is
not occasional,” (Paragraph 5).
• Security of processing (Article 32). Paragraph 4: “… unless he or she [natural person] is required to do {so} by Union or Member State law” <(e)>.
• Notification of a personal data breach to the supervisory authority (Article 33). Paragraph 3 “(t)he notification referred to in
Paragraph 1 shall at least:” Points a to d. [The notification shall] “communicate the name and contact details of the data protection
officer<i> or other contact point where more information can be obtained” (Point b).
• Communication of a personal data breach to the data subject (Article 34). “When the personal data breach is likely to result in a
high risk to the rights and freedoms of natural persons, |the controller shall communicate the personal data breach to the data
subject without undue delay|” (Paragraph 1). “The communication to the data subject {referred to in Paragraph 1 of this Article}
shall describe in clear and plain language the nature of the personal data breach and contain at least the information and measures
referred to in points (b), (c) and (d) of Article 33(3)” (Paragraph 2). However, Paragraph 3 states that “the communication to the
data subject<(e)> referred to in Paragraph 1 shall not be required, if any of the following conditions are met:
o the controller has implemented appropriate technical and organisational protection measures… (Point a),
o the controller has taken subsequent measures which | [the controller has] ensure that the high risk to the rights and
freedoms of data subjects referred to in Paragraph 1 is no longer likely to materialise|(Point b),
o it would involve disproportionate effort. |In such a case, there shall instead be a public communication or similar measure
whereby the data subjects are informed in an equally effective manner|” (Point c)
“If the controller has not communicated the personal data breach to the data subject, the supervisory authority {, having
considered the likelihood of the personal data breach resulting in a high risk,} may require it[controller] to do so or [the
supervisory authority] may decide that any of the conditions referred to in Paragraph 3 are met” (Paragraph 4).
• Data Protection Impact Assessment (Article 35), (DPIA) is requested from the controller prior to the processing. “Where
appropriate, the controller shall seek the views of data subjects <(e)> [where appropriate] or their representatives on the
intended processing, …” (Paragraph 9).
• Data protection officer (Article 37). Paragraph 3 “Where the controller or the processor is a public authority or body, |a single
data protection officer may be designated for several such authorities or bodies, taking account of their .. and size|”. In addition,
“(i)n cases other than those referred to in Paragraph 1, the controller or processor or associations and other bodies representing
categories of controllers or processors | [the controller or processor] may {or, where} required by Union or Member State
law 4 <(e)> shall, designate a data protection officer 5|. “The data protection officer<(e)> may be a staff member of the controller
or processor or fulfil the tasks on the basis of a service contract” (Paragraph 6).
• Tasks of the data protection officer (Article 39). “The data protection officer shall have at least the following tasks: (Paragraph 1
Points a to c)”. “to monitor compliance with this Regulation… and training of staff <i> involved in processing operations, and
the related audits” (Point b).
• Monitoring of the approved codes of conduct (Article 41). “{Without prejudice …} the competent supervisory authority and
{…,} a body {…} shall {, …,} take appropriate action |in cases of infringement of the code by a controller or processor, including
suspension or exclusion of the controller<(i)> or processor concerned from the code|. It shall inform the competent
supervisory authority of such actions and the reasons for taking them” (Paragraph 4).
• Certification (Article 42). “In addition, …, data protection certification mechanisms, seals or marks approved … may be established
for the purpose of demonstrating the existence of appropriate safeguards provided by controllers or processors that are not subject
to this Regulation pursuant to Article 3 within the framework of personal data transfers to third countries or international
organisations under the terms referred to in point (f) of Article 46(2). Such controllers or processors shall make binding and
enforceable commitments, {via contractual or other legally binding instruments,} to apply those appropriate safeguards, including
with regard to the rights of data subjects<(e)> [ may be established].” (Paragraph 2).
• Certification bodies (Article 43) The Paragraph 2 “Certification bodies referred to in Paragraph 1 shall be accredited in accordance
with that Paragraph only where [Certification bodies] they have:” (Points a to e).
o “demonstrated their independence and expertise in relation to the subject-matter of the certification to the satisfaction
of the competent supervisory authority” (Point a),
o [Certification bodies have] “{... and} approved by the supervisory authority …or by the Board ...” (Point b),
o [Certification bodies have] “{established procedures and structures to handle complaints about infringements of the
certification or the manner in which} the certification {has been, or} is being, implemented by the controller or processor,
{and} to make those procedures and structures transparent to data subjects <i> and the public; and” (Point d),
o [Certification bodies have] “demonstrated, to the satisfaction of the competent supervisory authority, that their tasks and
duties do not result in a conflict of interests” (Point e).
“The Commission <(e)> may adopt implementing acts laying down technical standards for certification mechanisms and data
protection seals and marks…” (Paragraph 9).
• Transfers subject to appropriate safeguards (Article 46). “The appropriate safeguards referred to in Paragraph 1 may be provided
for, without requiring any specific authorisation from a supervisory authority, by:” (Paragraph 2 Points a to f),
[The appropriate safeguards may be provided by]
o “a legally binding and enforceable instrument between public authorities or bodies” <(e)> (Point a),
o “standard data protection clauses adopted by the Commission in accordance with the examination procedure” (Point c),
o ” standard data protection clauses adopted by a supervisory authority and approved by the Commission” (Point d),
o “an approved code of conduct pursuant to Article 40 together with binding and enforceable commitments of the controller or
processor in the third country to apply the appropriate safeguards, including as regards data subjects' rights, or” (Point e),
”an approved certification mechanism … together with binding and enforceable commitments of the controller or processor in
the third country to apply the appropriate safeguards, including as regards data subjects' rights” (Point f).
4 the controller or processor required by Union or Member State law shall designate a data protection officer.
5 the controller or processor may designate a data protection officer.
• Binding corporate rules (Article 47). Paragraph 1, “the competent supervisory authority shall approve binding corporate rules
in accordance with the consistency mechanism set out in Article 63, provided that they:”
(Points a to c)
o “[corporate rules] are legally binding and apply to and are enforced by every member concerned of the group of
undertakings {,} or group of enterprises engaged in a joint economic activity {,} including their employees” (Point a)
“The binding corporate rules shall specify at least,” Paragraph 2 points a to n:
[The binding corporate rules shall specify]
o “the mechanisms within the group of undertakings, … for ensuring the verification of compliance with the binding corporate
rules. Such mechanisms shall include …, and should be available upon request to the competent supervisory authority”
<(e)> (Point j),
• Transfers or disclosures not authorised by Union law (Article 48). “Any judgment of a court or tribunal and any decision of an
administrative authority of a third country requiring a controller<(i)> or processor<(i)> to transfer or disclose personal data may
only be recognised or enforceable in any manner if based on an international agreement, such as a mutual legal assistance treaty, in
force between the requesting third country and the Union or a Member State, without prejudice to other grounds for transfer pursuant
to this Chapter.”
• Powers (Article 58) “Each supervisory authority shall have all of the following corrective powers:” (Paragraph 2 Points a to j)
o [Each supervisory authority shall have] to order the controller or the processor to comply with the data subject's requests
<(e)> [require order] to exercise his or her rights pursuant to this Regulation (Point c)
• Processing of the national identification number (Article 87). “Member States <(e)> may further determine the specific conditions
for the processing of the national identification number or any other identifier of general application shall be used only under
appropriate safeguards for the rights and freedoms of the data subject pursuant to this Regulation.”
• Processing in the context of employment (Article 88). “Member States <(e)> may, by law or by collective agreements, provide for
more specific rules to ensure the protection of the rights and freedoms in respect of the processing of employees'<(i)>
personal data in the employment context, …, and for the purpose of the termination of the employment relationship” (Paragraph
1). “Those rules shall include suitable and specific measures to safeguard the data subject's human<(i)> dignity, legitimate
interests and fundamental rights, with particular regard to the transparency of processing, the transfer of personal data
within a group of undertakings<(i)>, or a group of enterprises<(i)> engaged in a joint economic activity and monitoring
systems at the work place.” (Paragraph 2).
• Safeguards and derogations relating to processing for archiving purposes in the public interest, scientific or historical research
purposes or statistical purposes (Article 89). “Processing for archiving purposes in the public interest, scientific or historical
research purposes or statistical purposes, [Processing purposes] shall be subject to appropriate safeguards, in accordance with
this Regulation, for the rights and freedoms of the data subject. Those safeguards shall ensure that technical and organisational
measures are in place [“controller <i> and the processor<i> shall implement appropriate technical and organisational measures”
(Article 32 Paragraph 1)] in particular in order to ensure respect for the principle of data minimisation. …. Where those purposes can
be fulfilled by further processing which does not permit or no longer permits the identification of data subjects, those purposes
shall be fulfilled in that manner” (Paragraph 1). “Where personal data are processed for scientific or historical research purposes or
statistical purposes, Union or Member State law <(e)> may provide for derogations from the rights ..., and such derogations are
necessary for the fulfilment of those purposes.” (Paragraph 2). “Where personal data are processed for archiving purposes in the
public interest, Union or Member State law<(e)> may provide for derogations from the rights … and such derogations are
necessary for the fulfilment of those purposes.” (Paragraph 3).
• Obligations of secrecy (Article 90), “Member States <(e)> may adopt specific rules to set out the powers of the supervisory
authorities laid down in points (e) and (f) of Article 58(1) in relation to controllers <(e)> or processors <(e)> that are subject,
under Union or Member State law or rules established by national competent bodies, to an obligation of professional secrecy
or other equivalent obligations of secrecy where this is necessary and proportionate to reconcile the right of the protection
of personal data with the obligation of secrecy. ” (Paragraph 1).
Paper VIII
Authors: Gain U, Hotti V.
Year: 2021.
Article title: Low-code AutoML-augmented Data Pipeline – A Review and Experiments.
Journal: Journal of Physics: Conference Series 1828 012015.
Publisher: IOP Publishing Ltd
Permissions from co-authors via email: Hotti Virpi received 9.11.2021 at 7:57
Copyright: This article is published open access under a Creative Commons Attribution license CC BY
http://creativecommons.org/licenses/by/3.0/ http://dx.doi.org/10.1088/1742-6596/1828/1/012015
ISAIC 2020 IOP Publishing
Journal of Physics: Conference Series 1828 (2021) 012015 doi:10.1088/1742-6596/1828/1/012015
Low-code AutoML-augmented Data Pipeline – A Review and Experiments

Ulla Gain* and Virpi Hotti

School of Computing, University of Eastern Finland, Finland
*Email: [email protected]; [email protected]
Abstract. There is a lack of knowledge concerning the low-code autoML (automated machine
learning) frameworks that can be used to enrich data for several purposes concerning either data
engineering or software engineering. In this paper, 34 autoML frameworks have been reviewed
based on the latest commits and augmentation properties of their GitHub content. The PyCaret
framework was the result of the review due to requirements concerning adaptability by Google
Colaboratory (Colab) and the BI (business intelligence) tool. Finally, the low-code autoML-
augmented data pipeline from raw data to dashboards and low-code apps has been drawn based
on the experiments concerning classifications of the "Census Income" dataset. The constructed
pipeline preferred the same data to be a ground for different reports, dashboards, and applications.
However, the constructed low-code autoML-augmented data pipeline contains changeable
building blocks such as libraries and visualisations.
1. Introduction
Common data models and common data lakes are examples of means to promote both low-code application
development and AI-BI-insights generation. However, both individuals and organisations build their
data models and data storages, or at least they make experiments concerning data pipelines from raw
data into insights. At the same time, there is pressure to predict and even prevent the behaviour of things
such as customers.
There are several machine learning (ML) frameworks that can be adapted to build and deploy state-
of-the-art machine learning models for predictions and detections as well as consume the models for
unseen data. However, there is a lack of knowledge concerning the low-code autoML frameworks that
can be used to generate models with minimum setting up parameters as well as to automate data
processing workflows (a.k.a., pipelines). There are some open-source Python packages for pipeline
development. However, some autoML frameworks orchestrate the entire pipeline from data preparations
into adaptable models. Those frameworks automate, for example, missing value imputations, categorical
data transformations, and hyperparameter tunings.
The review is based on free and open-source software (FOSS) ML frameworks [1]. First, we research
whether there are the autoML wrappers of the state-of-the-art methods to do either supervised or
unsupervised learning on tabular data. Our research questions are as follows: Have the selected ML
frameworks been committed to during the last three years? If they have, then we research possibilities to
use the framework to label classes, clusters and outliers as well as pre-calculate continuous values for
tabular data. Finally, we figure out whether the autoML framework is based on Python because we have
deployment requirements concerning Google Colaboratory and the Microsoft Power BI tool. Google
Colaboratory can be used to upstream data, and several models can be generated and evaluated

Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution
of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Published under licence by IOP Publishing Ltd
without feature engineering, such as transforming categorical values into numerical ones. Moreover, the
best ML model shall be runnable in the Python script of the BI tool such as Microsoft Power BI. The
main reason for that is to provide insights-driven data pipelines where data is ingested, unified, and
mastered, as well as analysed and enriched to provide a ground for reports, dashboards, and applications.
2. Review Results
Twenty-nine frameworks seem to be under construction based on the latest commits and augmentation
properties of their GitHub content (Table 1).
Table 1. The autoML frameworks and the latest commits (15.10.2020).
Framework GitHub 2018 2019 2020
Acme https://github.com/deepmind/acme x
AdaNet https://github.com/tensorflow/adanet x
Analytics Zoo https://github.com/intel-analytics/analytics-zoo x
auto_ml https://github.com/ClimbsRocks/auto_ml x
Blocks https://github.com/mila-udem/blocks x
Detectron2 https://github.com/facebookresearch/detectron2 x
Dopamine https://github.com/google/dopamine x
Fastai https://github.com/fastai/fastai/ x
Featuretools https://github.com/Featuretools/featuretools x
FlyingSquid https://github.com/HazyResearch/flyingsquid x
Karate Club https://github.com/benedekrozemberczki/karatecluB x
Keras https://github.com/keras-team/keras x
learn2learn https://github.com/learnables/learn2learn/ x
Lore https://github.com/instacart/lore x
Mljar https://github.com/mljar/mljar-supervised x
MLsquare https://github.com/mlsquare/mlsquare x
NeuralStructuredLearning https://github.com/tensorflow/neural-structured-learning x
NNI https://github.com/Microsoft/nni x
NuPIC https://github.com/numenta/nupic x
Plato https://github.com/uber-research/plato-research-dialogue-system x
Polyaxon https://github.com/polyaxon/polyaxon x
PyCaret https://github.com/pycaret/pycaret x
Pyro https://github.com/uber/pyro x
Pythia https://github.com/facebookresearch/pythia x
PyTorch https://github.com/pytorch/pytorch x
ReAgent https://github.com/facebookresearch/ReAgent x
RLCard https://github.com/datamllab/rlcard x
Scikit-learn https://github.com/scikit-learn/scikit-learn x
Streamlit https://github.com/streamlit/streamlit x
TF Encrypted https://github.com/tf-encrypted/tf-encrypted x
Theano https://github.com/Theano/Theano x
Thinc https://github.com/explosion/thinc x
Turi & TuriCreate https://github.com/apple/turicreate x
XAI https://github.com/EthicalML/xai x
There are only two frameworks (Table 2) that can be used to label classes, clusters and outliers as
well as pre-calculate continuous values for tabular data. Observe that other augmentation purposes are
not predefined due to vague descriptions of the frameworks. However, the PyCaret framework is the
only low-code machine learning library that can be used by Google Colaboratory and Microsoft Power
BI [2].

Table 2. Appraised autoML frameworks and their augmentations (outliers, clusters, classes, and pre-
calculations) for tabular data and other purposes.
Framework Outliers Clusters Classes Pre-calculations Other purposes
Acme reinforcement learning
AdaNet x
Analytics Zoo time-series computer vision, NLP, recommendation
Detectron2 object detection
Dopamine reinforcement learning
Fastai x image classification, image segmentation, text-based sentiments, recommendation
Featuretools automate feature engineering
FlyingSquid labelling
Karate Club unsupervised learning on graph-structured data
learn2learn meta-learning
Lore standardise ML techniques across multiple libraries
Mljar x regression machine-learning pipelines
MLsquare x recommendation
MMF (fka Pythia) vision and language modelling
NeuralStructuredLearning image classification
NNI manages AutoML experiments
NuPIC hierarchical temporal memory (HTM) algorithms
Plato conversational AI agents
Polyaxon container-native engine for running machine learning pipelines
PyCaret x x x regression, association mining, NLP, machine learning pipelines, time series
Pyro deep probabilistic modelling
PyTorch provide tensor routines
ReAgent an end-to-end platform for applied reinforcement learning (RL)
RLCard toolkit for reinforcement learning in card games
Scikit-learn x x x regression, pre-processing, model selection, dimensionality reduction, time-series
Streamlit to create apps for machine learning projects
TF Encrypted enable training and prediction over encrypted data
Theano define, optimise, and evaluate mathematical expressions involving multi-dimensional arrays
Thinc composing models
Turi & TuriCreate recommendations, object detection, image classification, image similarity
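For concreteness, the four tabular augmentations appraised in Table 2 (outlier labels, cluster labels, class labels, and pre-calculated continuous values) can be sketched with scikit-learn, the other framework marked for all of them. The snippet below is an illustrative sketch on synthetic data, not the paper's experiment:

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.ensemble import (IsolationForest, RandomForestClassifier,
                              RandomForestRegressor)

# Synthetic tabular data standing in for a real dataset.
rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.normal(size=50), "x2": rng.normal(size=50)})
df["y_class"] = (df["x1"] > 0).astype(int)
df["y_value"] = 2 * df["x1"] + rng.normal(scale=0.1, size=50)

features = df[["x1", "x2"]]
# Outlier labels: -1 marks an outlier, 1 an inlier.
df["outlier"] = IsolationForest(random_state=0).fit_predict(features)
# Cluster labels.
df["cluster"] = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
# Class labels.
df["class"] = RandomForestClassifier(random_state=0).fit(features, df["y_class"]).predict(features)
# Pre-calculated continuous values.
df["pre_calc"] = RandomForestRegressor(random_state=0).fit(features, df["y_value"]).predict(features)
```

Each augmentation appends one column to the tabular data, which is the enrichment shape the pipeline in Section 3 feeds to reports and dashboards.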
3. Experiment-based Low-code AutoML-augmented Data Pipeline
The "Census Income" dataset (https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data)
columns are "Age", "Workclass", "fnlwgt", "Education", "Education-Num", "Marital-Status", "Occupation",
"Relationship", "Race", "Sex", "Capital-Gain", "Capital-Loss", "Hours-per-week", "Country", and
"Over-50K". The PyCaret framework contains 18 classification models that are used to construct models
when we figure out features that can be used to predict whether the yearly incomes are more or less than
50K (i.e., target='Over-50K'). The PyCaret framework highlights preferred models based on several
metrics (Figure 1). Further, the PyCaret framework illustrates the most important features. The classified
labels can be used, for example, as slicers in a report (Figure 2) to get a deeper understanding concerning
meaningful fields.
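As a toy illustration of the slicer idea behind Figure 2, assume the predicted class has been appended as a "Label" column; a report-style breakdown of a meaningful field by label can then be computed with pandas (the rows below are invented for the sketch, not drawn from the "Census Income" dataset):

```python
import pandas as pd

# Toy labelled rows: "Label" stands for the classifier's predicted
# 'Over-50K' class; the other columns mimic Census Income fields.
labelled = pd.DataFrame({
    "Label": [">50K", "<=50K", ">50K", "<=50K", "<=50K"],
    "Education": ["Masters", "HS-grad", "Bachelors", "HS-grad", "Some-college"],
    "Hours-per-week": [50, 35, 45, 40, 20],
})

# Slicing by the predicted label, as a report slicer would.
summary = labelled.groupby("Label")["Hours-per-week"].mean()
```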
Figure 1. Classifiers and comparable metrics, the yellow-coloured of which refer to the best ones, and
the stacked bar illustrates meaningful features. Observe that the underscore ("_") separates the
feature name from the feature value when the feature type is categorical.
Figure 2. The classified labels as slicers for the meaningful fields.
We made experiments where Google Colaboratory (Colab) is used as a sandbox for the experiments
of the PyCaret framework. Microsoft Power Platform that contains Power BI and Power Apps is used
to provide reports, dashboards, and low-code applications. In our experiment, two Python scripts have
been run in the Microsoft Power BI query editor – one to build the classifier and another to classify the
dataset [3]. The autoML-augmented data pipeline (Figure 3) from raw data to dashboards and low-code
apps has been drawn based on the experiments such as classifications of the "Census Income" dataset.
4
ISAIC 2020 IOP Publishing
Journal of Physics: Conference Series 1828 (2021) 012015 doi:10.1088/1742-6596/1828/1/012015

Figure 3. Example of the low-code autoML-augmented data pipeline.
The meanings of the numbered pipeline items are the following:
1) Curated or unmanifestable data. When new technologies such as low-code autoML wrappers
are evaluated, then the known datasets have been used because they are curated datasets the
insights of which have been verified. However, data are usually unmanifestable, and we will
figure out both attributes and instances as well as what data will tell us.
2) Explorable or learnable data. CO stands for Google Colaboratory (Colab), and it is easy to
ingest data to explore or build models from it. Nowadays, plenty of notebooks are used to make
experiments concerning new technologies such as the low-code autoML wrappers. Moreover,
the Colab-liked environments facilitate setups concerning the development environments
because there are pre-installed packages and new installations are easy to make. The icons in
our example (Figure 3) are the following: a 'tree climbing bear' stands for the Pandas library,
the functions of which serve both data framing and exploratory analysis, and a rocket stands
for the PyCaret library, the functions of which serve both modelling and pipelining. Two other
icons, a cube with N for Numpy and colour bars for SHAP (SHapley Additive exPlanations),
are examples of the libraries that are used by PyCaret.
3) Schema-relatable or stand-alone data. When data seem to be valid for further processing, then
some transformations might be made to prepare data either to follow the selected schema (e.g.,
Microsoft Common Data Model) or to get insights from data without the predefined scheme.
4) Model files or best model identifiers (id). Power BI (a yellow column image) offers possibilities
to run Python scripts at the table level. It is possible to run model files that are produced by
other tools. However, we had some interoperability problems when we tried to run the model
files, and Power BI is not flexible for making experiments and comparisons between the models.
Therefore, the best model identifier (id) is used to create a model and enrich the tabular data.
5) Dashboards and reports. Power BI (a yellow column image) offers several visualisation
possibilities, such as quick insights and visual-based analyses, as well as AI visuals named
"Key influencers" and "Decomposition tree". A funnel image stands for a slicer within the AI
visuals that gives insights from the data. Further, we formed tiles from the dashboards for the
low-code applications.

ISAIC 2020 IOP Publishing
Journal of Physics: Conference Series 1828 (2021) 012015 doi:10.1088/1742-6596/1828/1/012015
6) Low-code applications. Power Apps (diamonds inside the shape) contains standard entities,
such as Account, that are deployable. In our experiments, we merged the data tables into one
Excel file and then created a canvas app from the file. Further, we used tiles of the dashboards
in the low-code applications, and the low-code applications as parts of the reports and
dashboards.
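The enrichment step described in point 4 above can be sketched as follows. This is a minimal stand-in in plain Python: in the actual pipeline the table would be a pandas DataFrame inside a Power BI Python script and the predictions would come from the PyCaret model created from the best model id; the `toy_predict` rule below is purely hypothetical and only stands in for the trained classifier so the pattern stays runnable.

```python
# Stand-in for enriching tabular data with a model's predicted class.
# In the real pipeline the predict callable would wrap PyCaret's
# predict_model on the model created from the best model identifier.

def enrich_rows(rows, predict):
    """Return copies of the input rows with a predicted 'Class' column added."""
    return [{**row, "Class": predict(row)} for row in rows]

# Hypothetical rule-based stand-in for the trained model.
def toy_predict(row):
    return "high" if row["amount"] >= 100 else "low"

table = [
    {"account": "A-1", "amount": 250},
    {"account": "A-2", "amount": 40},
]
enriched = enrich_rows(table, toy_predict)
print([r["Class"] for r in enriched])  # ['high', 'low']
```

The point of the pattern is that the original columns pass through untouched, so the enriched table can feed reports, dashboards, and low-code applications exactly like the raw table.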
The power of the low-code autoML framework is mainly in the predefined parameters concerning model
setup. The PyCaret framework (i.e., the Python package) requires two mandatory parameters (data and
target column) to set up models, and the rest of the setup parameters are either ML-task-specific (e.g.,
classification or regression) or common ones. We did not report the effects of changes to the setup
parameters. However, our guideline for the setup parameters is based on the following two groups, where
the verb in the list identifier serves as memory support for perceiving the main function of the
parameter:
• Feature collection
o Reduce. pca = False, pca_method = 'linear' | 'kernel' | 'incremental', pca_components = None
o Bin. bin_numeric_features = None
o Group. group_features = None, group_names = None
o Ignore. ignore_features = None
o Permutate. feature_selection = False, feature_selection_method = 'classic' | 'boruta',
feature_selection_threshold = 0.8
o Drop. remove_multicollinearity = False, multicollinearity_threshold = 0.9
o Combine. feature_interaction = False, interaction_threshold = 0.01, feature_ratio = False
o Relate. polynomial_features = False, polynomial_degree = 2, polynomial_threshold = 0.1;
trigonometry_features = False
o Detect. remove_outliers = False, outliers_threshold = 0.05
o Cluster. create_clusters = False, cluster_iter = 20
• Feature values
o Impute. categorical_imputation = 'constant', numeric_imputation = 'mean'
o Type. categorical_features = None, numeric_features = None, date_features = None,
ordinal_features = None
o Encode. high_cardinality_features = None, high_cardinality_method = 'frequency' |
'clustering'
o Unwant. combine_rare_levels = False, rare_level_threshold = 0.10
o Rescale. normalize = False, normalize_method = 'zscore' | 'minmax' | 'maxabs' | 'robust'
o Reshape. transformation = False, transformation_method = 'yeo-johnson' | 'quantile'
o Retarget. transform_target = False, transform_target_method = 'box-cox' | 'yeo-johnson'
o Replace. ignore_low_variance = False
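The two parameter groups above can be collected programmatically before a setup call; a minimal sketch in plain Python (the values shown are a subset of the defaults listed above, and PyCaret itself is not imported here — in a real run they would be passed on as `pycaret.classification.setup(data=df, target="Class", **setup_kwargs)`):

```python
# A subset of the defaults from the two parameter groups above, collected
# as keyword arguments for a later setup() call. The verb from the list
# identifier is kept as a comment for each entry.

feature_collection = {
    "pca": False,                       # Reduce
    "bin_numeric_features": None,       # Bin
    "group_features": None,             # Group
    "ignore_features": None,            # Ignore
    "feature_selection": False,         # Permutate
    "remove_multicollinearity": False,  # Drop
    "feature_interaction": False,       # Combine
    "polynomial_features": False,       # Relate
    "remove_outliers": False,           # Detect
    "create_clusters": False,           # Cluster
}

feature_values = {
    "categorical_imputation": "constant",  # Impute
    "numeric_imputation": "mean",          # Impute
    "combine_rare_levels": False,          # Unwant
    "normalize": False,                    # Rescale
    "transformation": False,               # Reshape
    "transform_target": False,             # Retarget
    "ignore_low_variance": False,          # Replace
}

setup_kwargs = {**feature_collection, **feature_values}
print(len(setup_kwargs))  # 17
```

Keeping the defaults in named groups makes it easy to override one verb's parameters at a time when comparing experiment runs.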

4. Conclusions
There are several ML tasks, and they can be grouped in several ways [1,4]. When we compared the
autoML frameworks with the repository containing a curated list of ML libraries [4], we realised that
PyCaret is categorised under "Model Training Orchestration". Some other libraries (a.k.a. frameworks),
such as TPOT and ktrain, also include autoML functionality, but we cannot use them in the
Power BI pipelines.
Business use cases within outcome-centric descriptions of the low-code machine learning libraries
(or wrappers such as PyCaret) are essential to increase awareness of ML-based augmentations such
as outliers, clusters, and classes. Lack of understanding of the algorithms and setup parameters is a
pitfall when we adapt the functionalities of the wrappers. However, the autoML insights at least raise
questions and awareness of the autoML possibilities, especially when business users can use them
in BI tools without pressure concerning the details of multiple algorithms.


A Google Scholar search with the phrase autoML+low-code returned 14 hits, ten of which are related
studies. However, these studies do not overlap with ours. There is one low-code library for augmented
machine learning, called ktrain [5], that has been used to classify texts and images as well as to build
an end-to-end QA system. Furthermore, some discussion concerning low-code development practices
has been highlighted in the context of sentiment analysis [6]. Low-code cases can be perceived as part
of the AI context. Therefore, lessons learned from AI functionality in enterprise contexts have been
presented [7], as well as challenges concerning automated workflows that conduct embedded ML in
Business Process Management Software (BPMS) [8]. In general, autoML and low-code platforms are
implementations of AI [9], or AI is used to empower something, such as to assess and manage critical
issues of performance and stability of applications [10]. There are several skill requirements
concerning AI tasks [11], as well as open-source tools and commercial ones (e.g., Amazon machine
learning) [12]. In general, open-source technologies [13] and fairness in ML [14] are meaningful in
low-code autoML development.
Nowadays, the same data are preferred as the ground for different reports, dashboards, and
applications. The data engineering and software engineering disciplines are quite near each other. The
low-code autoML frameworks (or wrappers) that are usable in the BI tools give a possibility to augment
understanding concerning, for example, classes for tabular data. In general, the low-code autoML
frameworks are cognitively supportive when they manifest insights from datasets.

References
[1] Mardjan M 2020 Free and Open Machine Learning Release 1.0.1 Available at
https://readthedocs.org/projects/freeandopenmachinelearning/downloads/pdf/latest/
[2] Moez A 2020 Machine Learning in Power BI using Pycaret. Towards data science. Available at
https://towardsdatascience.com/machine-learning-in-power-bi-using-pycaret-34307f09394a
[3] PyCaret 2020 Available at https://pycaret.readthedocs.io/en/latest/
[4] The Institute for Ethical AI & Machine Learning 2020 Available at
https://github.com/EthicalML/awesome-production-machine-learning
[5] Maiya A S 2020 ktrain: A low-code library for augmented machine learning Preprint
arXiv:2004.10703
[6] Carvalho A and Harris L 2020 Off-the-shelf technologies for sentiment analysis of social media
data: Two empirical studies AMCIS 2020 Proceedings pp 1–10
[7] Casati F, Govindarajan K, Jayaraman B, Thakur A, Palapudi S, Karakusoglu F, Chatterjee D 2019
Operating Enterprise AI as a Service International Conference on Service-Oriented Computing pp
331–344
[8] Thakur A, Palapudi S, Karakusoglu F, Chatterjee D 2019 Operating Enterprise AI as a Service
Service-Oriented Computing: 17th International Conference, ICSOC 11895
[9] Taulli T 2019 Implementation of AI In:Artificial Intelligence Basics (Apress Berkeley CA) pp 143–
159 https://doi.org/10.1007/978-1-4842-5028-0_8
[10] Taulli T 2020 RPA Vendors The Robotic Process Automation Handbook pp 217–258
[11] Dibia V, Cox A, Weisz J 2018 Designing for democratisation: introducing novices to artificial
intelligence via maker kits Preprint arXiv:1805.10723
[12] Sakhnyuk P A and Sakhnyuk T I 2020 Intellectual Technologies in Digital Transformation. IOP
Conference Series: Materials Science and Engineering 873 1
[13] Atwal H 2020 DataOps Technology In:Practical DataOps (Apress, Berkeley, CA) pp 215–247
https://doi.org/10.1007/978-1-4842-5104-1_9
[14] Caton S and Haas C 2020 Fairness in machine learning: a survey. Preprint arXiv:2010.04053

Paper IX
Author: Gain U.

Year: 2021.

Article title: Applying Frameworks for Cognitive Services in IIoT.

Journal: Journal of Systems Science and Systems Engineering 30: 59-84.

Publisher: Springer Nature, DOI: 10.1007/s11518-021-5480-x

The final publication is available at link.springer.com

https://link.springer.com/article/10.1007/s11518-021-5480-x

Reproduced with permission in this thesis under the Copyright Transfer Statement, Springer.


J SYST SCI SYST ENG
Vol. 0, No. 0, 0, pp. 1–26 ISSN: 1004-3756 (paper), 1861-9576 (online)
DOI: To be added later CN 11-2983/N

Applying Frameworks for Cognitive Services in IIoT


Ulla Gain
School of Computing, University of Eastern Finland, FI-70211 Kuopio, Finland
[email protected] (::B)

Abstract. Technology enables creating new types of products, processes and services (i.e., things),
whose outcomes alter traditional competition and industry boundaries and create new lasting value. The
digitalization process uses digital technologies to provide the possibilities of new revenue and value
production, i.e., it changes business models and offers new value propositions. This change is ongoing.
The ten most important strategic technology trends in 2019 include Edge computing, Blockchain, event-
and data-driven strategies, Digital Twins, the maintenance of transparency (i.e., traceability), and
Intelligent Apps and Analytics (Gartner, 2018). In this paper, we experiment with the capabilities of
intelligent applications to match the industrial business needs. This paper aims to bring insights closer
to business objectives. Digitalization's technological advantages can be achieved through data-driven
strategies wherein cognitive services are integrated into IoT (Internet of Things) and big data. We
experiment with the Industrial IoT (IIoT) business models and value propositions to match the
intelligent insights of cognitive solutions to business objectives. The IIoTs support and demand
transparency, and thus also data-driven objective insights, because cognitive solutions can enhance
insights on a product, a process or a service and therefore provide measurable business objectives.
Functional indicators enable interconnected smart things to collaborate.
Keywords: Digitalization, industrial internet of things, cognitive services, digital twin, blockchain.

1. Introduction

'Digitalization' has become a central topic of discussion in the industry. This trend seeks to convert data into a resource that is analyzed and further processed into value-generating form. It begins with the observations of IIoT, gathering and connecting data to intelligent systems, and linking automated services to create an interactive or automated consolidated symbiosis with humans. Advanced technology is sought for competitive advantage. The traditional information systems are no longer enough in an international market where competitors offer comprehensive digital data services to their stakeholders. Competitiveness and value-added are related to business goals. The goals are measured according to the target values of factors. Together they constitute the process goal for each process in focus. Indicators are metrics that can help us understand 'how we are doing' against our objectives (Kaushik and Sybex 2009); they are useful since they can be further used to indicate needed actions. A very large amount of unstructured and structured information and signals go through the organization's information processes; only some of that information is exploited. The automation capabilities of the actionable information are in the focus when we explore possibilities of cognitive services integration with IoT ecosystems and big data in a cloud computing platform. Technological exploration supports exploitation; it offers exploitation abilities to automate cognitive processes, minimize the subjective bias, and integrate knowledge into a learning environment. The information that triggers the action is meaningful because it does not have an absolute value unless it is actionable, and its results are measurable (Few 2015). The data-based strategies should be anchored on the objective insights (i.e., trans-
© Systems Engineering Society of China and Springer-Verlag GmbH Germany 2020


2 Ulla Gain: Applying Frameworks for Cognitive Services in IIoT

parent, augmented and actionable information), which underpins data-based models of the process (i.e., evidence-based approach). For example, IIoT aims to extend an organization's competitive advantage through existing processes, services, or/and product improvement (exploitation). Also, the search of the data-driven insights (exploration) by combining several IoT to knowledge retrieval and cognitive services in the industry. Value-added can be achieved when the processes exploit the cognitive services' potential to unlock human expertise for the more demanding tasks. Systems that require interdependence between human and advanced technological solutions (e.g., what has already been exploited in digitalized health care systems) analyze a vast amount of data and provide the treatment options with justification (Kelly 2015). In practice, very often in the projects of digitalization attention is paid to the disruptive technology, whereby the focus is going away from the objective targets, from measurements and the need of the customer (Scott 2015); moreover, the technical complexity assessment is overlooked (Miller 2001); (Gartner 2017). Harnessing information as a resource requires technical knowledge, the process, product or service know-how, and competent decision-making in the digitalization project. A project can confront many challenges such as a digital gap, resulting from slow adoption, lack of skills, barriers to entry, and organizational ambidexterity (Figure 6). The technological advantages of smart things consist of a confusing amount of technical choices. Therefore, this paper aims to crystallize the traceability in the form of transparency, bringing the technological advantages of digitalization closer to the business objectives. Business canvases are presented in an industry 4.0 environment. Besides, to serve the needs of stakeholders, the value propositions for industry 4.0 are presented. Thus, the paper provides an example of a framework to achieve measurable advantages in digitalization. This paper aims to bring objective insights closer to business objectives. Digitalization's technological advantages can be reached with data-driven strategies wherein cognitive services are integrated into IoT and big data. Three canvases (i.e., BMC, Lean and value proposition) were selected to experiment with the ability of intelligent applications (i.e., cognitive services) to meet the needs of the product, process and service ambidexterity (new smart object development (i.e., exploration) and existing smart object improvement (i.e., exploitation)) in the industrial business. Furthermore, we present an example of cognitive services' ability to bind the knowledge of service and business objective into the production line in an industrial environment. In this paper, digitalization is introduced as an example of a technology stack of digitalized Industry 4.0 (Figure 4). The stack of technology is constructed based on our example, Section 5. After that, the needed operational capabilities have been supplemented by the literature review of the technologies background. There are many articles about IoT, fewer when it is combined with cognitive services and digital twins. The used Scopus search has found three hits, two articles and one Conference Proceeding, tabulated papers in Figure 1. The keywords and the words of the abstracts of these articles are presented in word-clouds (Figure 1). Used search: ("digital twin" OR "digital twins") AND ("cognitive computing" OR "cognitive service") AND ("internet of things" OR "IoT"). EBSCO's search has been built as (digital twin or digital twins AND internet of things or IoT or smart objects or ubiquitous computing or Blockchain or smart contracts or cloud or cloud computing AND cognitive computing); it gives many articles ('130 470 hits'). The search has been limited to peer-reviewed Academic journals where the publishing date is from 1.1.1998 to 31.12.2021, which gives the remaining 5
938 hits (review date 16.12.2020). Found papers are presented in figures as follows: The fifty first journals that published most of these articles and the articles' publishers are presented in Figure 2. Further, the subject Thesaurus Terms are presented with a word cloud to give the idea of typical topics (Figure 3). Our paper's adaption in the industrial digitalization brings objective insights closer to the business objectives – IoT is combined with cognitive services, digital twin and value propositions for Industry 4.0. In the following sections, we describe the example of the technologies that construct a functional assembly in the environment of industry 4.0. Section 1.1 explains the used research methods. Section 2 explains the meaning of smart things and Industry 4.0 in general. After that, communication and connections are explained, which have an important role in processing (specifically in time-critical applications). Since industry 4.0 processing is circumstance specific, very often more information is needed before the support or decisions can be made; therefore, the Digital Twin and the knowledge base aspect are explained. Finding the advantages of digitalization requires that digitalization underpin an organization's strategies. Therefore, exploration and exploitation are explained with the concepts of business canvases, and further in an environment of industry 4.0 (Section 2.4). To search and find objective insights, we construct the value proposition canvas for Industry 4.0. Further, the characteristics of cognitive services are explored at the BMC, Lean and value proposition canvas. Finally, we give an example to bring objective insights closer to business objectives. In this example, digitalisation's technological advantages can be reached with data-driven strategies wherein cognitive services are integrated into smart things (Section 5).

1.1 Research methods
This paper uses qualitative research methods that underpin the research of Glass et al. (2004), which explains the qualitative methods in information systems research that specifically explore usage/operation and technology transfer and use evaluative research approaches. Furthermore, the used methods include the commonly used content analysis in software engineering (DeFranco and Laplante 2017).

1. Evaluation of the BMC, Lean and value proposition canvases for smart object development and improvement in the industrial business. The modifications have been done simultaneously with,
2. The literature review of new technologies, smart things and Industry 4.0, and the functional technology stack construction based on it:
   • the production line example of a cognitive-services-based technology stack,
   • operational capabilities supplemented by the literature review, and
3. Value proposition canvases are modified to correspond to new technologies' requirements and to find the indicators and factors that need improvement.

Canvases are selected as follows: firstly, canvas selection is based on awareness, i.e., BMC, Lean and value propositions are well-known in the industry. Secondly, new technologies challenge organizational ambidexterity. Therefore, both business canvases are introduced, meaning BMC and Lean. Thirdly, new technology or innovations modify the competitive ways, but the models' base remains; therefore, the experiment changes the naming convention to find the better correspondence. Value propositions have been constructed to reflect and
to show the new technology-based possibilities and need. The represented technologies have been chosen as follows: the technology choices are made within the practical need, considering the value propositions (i.e., the requirements of the example of the production line) and the technological capabilities available based on a literature review (however, presenting various example alternatives of operational technologies for the technology stack functional requirements).

Figure 1 The Abstracts and Keywords of the Found Articles in Scopus

Figure 2 The Major Journals that Published these Articles in Left and Major Publishers on the Right

2. Smart things and Industry 4.0
The disruptive technology or innovation (e.g., connected smart things) does not change the business's basic principles. For example, the strategy principles remain valid as well as the rules of rivalry and competitive advantage (e.g., the five forces of competition, Porter (2008)). Still, for example, the variety of competitive ways will change (Porter and Heppelmann 2014). These smart things enable companies and industries new competitive opportunities
and threats by reshaping the boundaries of the industry (Porter and Heppelmann 2014), e.g., as smartphones in camera markets have done (Ovans 2015). Looking more closely, the connected smart things have physical parts (i.e., mechanical and electrical parts), smart components (i.e., sensors, microprocessor, storage for data, controls, software, the embedded operating system and the user interface) and connectivity components (i.e., ports, antennae, and protocols for wireless or wired connections). Smart things can be grouped into four functional levels (Porter and Heppelmann 2014):

• Monitoring (includes alerts, the notification of changes): where sensor and external data make possible the monitoring of the condition of things, operations and the external environment.
• Control (software embedded in the cloud of smart things, or the thing): makes possible the control of thing functions and personalization of the user experience.
• Optimization (enabled algorithms to optimize the operation and use of the thing): allows enhancing product performance, and predictive diagnostics, services and repair.
• Autonomy (combined monitoring, control and optimization): enables the autonomous operation, enhancement and personalization of things; allows self-coordination of operation with other things and systems, self-diagnosis and service.

Figure 3 Top 50 Thesaurus Terms of the Subject and Cloud of Words on the Subject Thesaurus Terms

The combination of automation and data exchange, which includes cyber-physical systems, IoT, cloud computing and cognitive computing in manufacturing or engineering, is called Industry 4.0 (e.g., related concepts "Industrial Internet", "Integrated Industry", "Smart Industry" and "Smart Manufacturing") (Hermann et al. 2016). Industry 4.0 design concerns interconnection (e.g., wherein smart things are identified by RFID tags, i.e., unique identification), information transparency, decentralized decisions and technical assistance as follows (Hermann et al. 2016):

• Interconnection corresponds to connections of IoT (e.g., connected cars, smart cities); Internet of People (IoP) (e.g., social media); and Internet of Everything (IoE) (e.g., people, things, data,
processes). It contains the cooperation of human-human, human-machine and machine-machine ((Hermann et al. 2016); (Jaiswal 2015)).
• Merge of the physical and virtual world requires context-aware information transparency; on these terms, it enables new interconnected objects and people. The communication standards and cybersecurity have special importance within the interconnection between machines, devices, sensors and people.
• Decentralized decisions combined with interconnections enable the participants of IoE to work maximally autonomously. Within the event of disturbances (e.g., exceptions, conflicts) tasks move to a higher level (Hompel and Otto 2014). The benefits of decentralized decisions are that decisions can be made based on local or global information, which confers better decision-making and increases overall productivity (Malone 1999).
• The technical assistance function needs to be supported by the assistance system, which aggregates and visualizes information in a comprehensible form to interact and solve problems on short notice (Gorecky et al. 2014).

Figure 4 An Example of a Technology Stack of Digitalized Industry 4.0. This article describes the technology stack components, i.e., Smart things, Connectivity (authentication and security) Section 2.1, Application platforms and analytic application Section 2.2, Digital Twin and knowledge base Section 2.3; Integrated BI&A, Cognitive Services and External data sources (e.g., big data) are presented by way of examples and connection with the industry 4.0 presentation

Industry 4.0 connects people, systems, processes and physical assets, which presents improvements to industrial processes "involved in manufacturing, engineering, material usage and supply chain and life cycle management" (Hermann et al. 2016). Industry 4.0 can benefit from intelligent assets, such as a product, service or process that has the sensors and cognitive capabilities for self-diagnosis and communication and thus optimizes performance and maintenance, thereby reducing downtime, increasing efficiency and supporting repairs (see a model factory ((IBM 2018); (Dubley 2018)). Besides, cognitive processes' ability to analyze structured and unstructured data can provide insights that can lead to improved safety, quality and operations in the manufacturing and engineering process (see Industry 4.0 cognitive manufacturing with Watson IoT (Watson 2016)).

2.1 Communication and Interconnections
Interconnections of the smart things require detailed special know-how; they should be considered when planning solutions. The biggest downside of interconnections is privacy and security, "IoT is hackable", and it suffers from the internet outage of mobile and IoT devices (Lager 2017). The blockchain technology can elaborate on the cybersecurity and standards for the industry 4.0 interconnection between machines, devices, sensors and systems. Blockchain technology is based on the peer-to-peer network, where each connected node acts both ways, i.e., as server and client for the other members of the net. The nodes in the blockchain network ensure the validity and relaying of transactions. A blockchain is constructed from blocks. The block has a key of the present and of the previous block in its data structure (Figure 5); these key values construct a chain. In the chain, each key is established from the content of the block. Therefore, manipulating the content would break the chain, since manipulation would change the key in the block, and the changed block would not be accepted in the chain (Bitcoin 2017). The distributed ledger (i.e., generally agreed, shared, synchronously replicated, digital data) can be private or public and varied in size and structure. This technology supports transparency and auditing, as distributed data can provide a certificate-of-origin of the smart thing. Also, smart contracts (e.g., Ethereum) are applicable in remote systems management and maintenance. Smart contracts on the Blockchain contain objects, such as code functions, which can interact with each other (e.g., monitor, make decisions, store data). Moreover, they are executable as long as the blockchain network exists; they stay executable unless they are programmed self-destructive (Ethereum 2018). The core concept of the Ethereum blockchain is the Ethereum Virtual Machine, which is Turing complete in the meaning of competent: it simulates all programmed instructions accomplished by computers if the memory limitations are ignored. Admission of the awareness comes through the support of the omnichannel approach. Omnichannel means that the multi-channel connections need to work together, giving the customer a seamless compiled experience at any connection (see the figure Determine your Cross-Channel Strategy, available online: https://blog.hubspot.com/marketing/omni-channel-user-experience-examples) (MGI 2017). For example, IoT based omnichannel

Figure 5 The Example of Blockchain, where Each Block has Two Keys. A present block key is calculated from the block content and the key of the previous block. (Adjusted from (Yle 2016))
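The key-chaining shown in Figure 5 can be illustrated with a minimal hash-chain sketch in plain Python using hashlib. The block fields and the hashing scheme below are simplified illustrations of the principle (a key derived from the block content and the previous key), not the actual Bitcoin or Ethereum block format:

```python
import hashlib
import json

def block_key(content, previous_key):
    """Derive a block's key from its content and the previous block's key."""
    payload = json.dumps({"content": content, "prev": previous_key}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def build_chain(contents):
    """Build a list of blocks, each storing its content, the previous key, and its own key."""
    chain, prev = [], "0" * 64  # the genesis block points at an all-zero key
    for content in contents:
        key = block_key(content, prev)
        chain.append({"content": content, "prev": prev, "key": key})
        prev = key
    return chain

def chain_is_valid(chain):
    """Recompute every key; any manipulated content breaks the chain."""
    prev = "0" * 64
    for block in chain:
        if block["prev"] != prev or block_key(block["content"], prev) != block["key"]:
            return False
        prev = block["key"]
    return True

chain = build_chain(["order 1: 100 units", "order 2: 250 units", "order 3: 80 units"])
print(chain_is_valid(chain))   # True: every key matches its block's content

chain[1]["content"] = "order 2: 999 units"  # tamper with a middle block
print(chain_is_valid(chain))   # False: the recomputed key no longer matches
```

This is exactly why manipulating one block invalidates all later blocks: each subsequent key was computed over the now-changed key of its predecessor.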

expectations are described in the logistics service in Industry 4.0 by Tang et al. (2018). Particularly, in the process industry, the international initiative Data Exchange in the Process Industry (DEXPI) proposes a transport standard for the multi-channel connections, i.e., it aims to access omnichannel functionality. The DEXPI and ENPRO approach aims at harmonizing the interoperability of software tools and acts to mediate between diverse information silos (Wiedau et al. 2018). In industry 4.0, this means seamless cooperation with the stakeholders of the ecosystem. It requires context-aware information transparency, for example, for asset life cycle services and maintenance, which follow the need for optimization and the maintenance strategy. For interoperability, standards are an integral part of industry 4.0 (see "Supporting the Interoperability of Industry 4.0 Standards through Semantic Technology"), where RAMI 4.0 provides an administrative shell (Mosch 2020).

2.2 Processing to achieve agile, smart information
When the latency is a critical attribute of the system (e.g., in time-critical applications), or in systems where devices are distant from the cloud platform, fog and edge computing are introduced to address these types of questions. Devices of fog and edge intermediate between cloud-based applications and smart things; they lower the transactions between the nodes and ensure the construction of value-adding insights in real time (Raj and Raman 2017) (see latency in analytics for fog computing (Cisco 2018a)). Fog computing is the flexible configuration of IoT, where all functions (i.e., processing, monitoring and controlling, data storage) can be realized at any member device of the architecture, e.g., at the edge devices such as a sensor or 'smart thing' (e.g., the equipment itself), gateways, routers, or at a cloud or data centre (Raj and Raman 2017), (McKendrick 2016). Fog computing originates from Cisco (see architecture (Cisco 2018b)) and is a nearby term for edge computing, meaning that analytics can be performed at any level of the architecture. In contrast, edge analytics are performed next to the network's edge (Raj and Raman 2017), (McKendrick 2016), for example, IBM's real-time analytics and data streaming, a modular model which is on the edge (IBM and Apache 2018).

Figure 6 The Digital Gap of Smart Things. Digital Strategy is the mean to balance the industrial ambidexterity, competition in the field, adaption and learning

Figure 7 Smart Things, Smarter Systems Add the Business Value for Industry 4.0 (VPC4I). Intelligent issues add system features that create opportunities for direct and indirect savings, including new perspectives and sources for innovation

Fog computing combines IoT, 5G and AI systems where security, cognition, agility, latency, and efficiency have high priority. That means providing trusted transactions, "awareness of client-centric objective to enable autonomy", scalability, real-time computing and system control, and merging the unused local resources of participating devices (OpenFog 2018). In fog computing, real-time results and analytics are solved with various architectural approaches, such as, in the manufacturing field, the combinations of hybrid cloud (private cloud, public cloud) (Wu et al. 2016), and also approaches to reduce the amount of data transfer using fuzzy logic and neural classifiers (Hazem et al. 2017). 5G is an approach to solve the bandwidth problem, where the network's latency is lower. It allows controlling the devices; for example, Nokia's 5G demonstration of the mobile edge computing environment (i.e., machine-to-machine interconnection) wherein commands for the robots are controlled at edge computing (see Introducing 5G with NOKIA (Nokia 2017)). 5G will allow connections, for example, human to human, and actuators and sensors to the environment and between sensors. Its specifications require support for existing services and emerging needs such as machine-to-machine communications, and also virtual, augmented, and assisted reality; its standardization timeline aims to offer 5G technology by 2020 (in more detail (Blanco et al. 2017); (Spectrum 2017)). The cognitive services are a part of AI, as well as intelligent applications. Gartner predicts that by 2018 most of the 200 world's largest companies will use intelligent apps within big data and analytic tools to take advantage of building and improving the customer experience (Gartner 2018). The
customer's engagement and quality of service have been seen as critical driving forces for competitive advantages (Ferreira et al. 2017). For instance, IBM has implemented cognitive-powered cloud services: Commerce, Supply Chain and Marketing Insights, which aim to strengthen customer engagement and provide personalized offers and services (IBM 2017). In particular, this paper experiments with the capabilities of intelligent applications to match industrial business needs. Therefore, we use the Business Model Canvas (BMC), the Lean canvas and the capabilities of the value proposition canvas to find and match intelligent insights of cognitive services from IIoT and knowledge retrieval to business objectives.

2.3 Digital Twin and knowledge base

The strategic technology trend is developing towards expanding IoT into digital twins and mesh. The digital twin is a dynamic digitized model of real-world physical things. It allows simulation with real data, enables added value through the management of assets, analysis and controls, helps in troubleshooting, and improves things (see the presentation online: (Mind and Machines 2017)). Mesh corresponds to the intelligent digital ecosystem, which provides dynamic connections between things, people and services through a chosen topology (Gartner 2018); (Ruiz-Garcia et al. 2009); (Sparrow 2019); (TP 2019). The simulation data for the digital twin creates a bridge between the designed model and the planned process. This means that the integrated concepts, which implement international standardization in practice, are a means of achieving interoperability in the process of integrating the data structures. The data structures need to use the same concepts (i.e., taxonomies and principles) so that the functional and physical worlds are connectable. For example, in the process industry, the plant and asset structure information is connected in a forward, predictive perspective and in a backward perspective to operational requirements, functional design, and asset specification and operational asset functions (Wiedau et al. 2018). The industry-specific knowledge base is useful in the Industry 4.0 environment, where industry- and product-specific technical assistance and decentralized decision support are needed. Connected knowledge sources, in the sense of process control systems and BI systems with operating and smart-thing data and smart indicators, can offer an agile opinion and, therefore, help rule over the situation. The modern BI&A platform offers a business perspective to centralized and decentralized use cases; it also covers wide ranges of technical capabilities, for example, agile, interactive visual exploration and analysis, and predictive and prescriptive analytics for applications and processes (Howson et al. 2017). When the cognitive services of the process, service or product and BI&A are interconnected, the system's abilities provide users with a multifunctional interface (e.g., natural language interface, virtual assistants, dashboards), augment awareness of the situation, and present optimized recommendations for the possible courses of action based on situational analysis and predictions. The results of cognitive services are probabilistic: they assign a confidence level to the potential result. The cognitive systems are to facilitate human-computer interaction (i.e., understanding of human reasoning) and, as a result, to transform the data into a form that humans can understand, e.g., to support optimal decision-making. Therefore, what is essential is the options of actions, aligned with measures that accord with the target values, factors that together constitute the definite goal.

2.4 Canvases, exploration and exploitation

Business models are "a set of assumptions or hypothesis" (Ovans 2015a), which define the business's concerns. The most comprehensive representation of these hypotheses is
the Business Model Canvas (BMC) proposed by Osterwalder (2012). His model contains nine building blocks for the critical assumptions about Key Partners, Key Activities, Key Resources, Value Proposition, Customer Relationships, Channels, Customer Segments, Cost Structures, and Revenue Streams. Further, he has supplemented the BMC with the plug-in Value Proposition Canvas, which helps to analyze the customers' needs (Osterwalder 2012). The Lean canvas of Maurya is adapted from the BMC so that Problem replaces Key Partners, Solution replaces Key Activities, Unique Value Proposition replaces Value Propositions, and Unfair Advantage replaces Customer Relationships (Maurya 2017). These two canvases have common blocks; they also have mutually complementary blocks (i.e., they do not rule out one another). Our approach needs those supplements because they emphasize the variety in ambidexterity (i.e., exploration and exploitation), and therefore both approaches have been introduced. Exploration is more emphasized in the Lean canvas (i.e., "entrepreneur-focused" (Maurya 2017)), whereas the BMC highlights exploitation (e.g., improvements); i.e., both business models integrate the ambidexterity. For example, the case study of Agius uses the BMC to construct the data-driven business model for Industry 4.0 (Agius 2017). In his study, the importance of the value proposition can be observed. Therefore, in this study, the value propositions of Industry 4.0 are brought out in more detail.

2.4.1 Industry 4.0 and Ambidexterity

In organizational ambidexterity, mature technologies and markets emphasize efficiency, control and incremental improvements, whereas new technologies and markets need flexibility, autonomy and experimentation (O'Reilly and Tushman 2013). However, in a manufacturing environment, in some cases the flexibility, autonomy and experimentation are very limited due to the nature of the process, for example in partly automated systems, where the onward process sets flexibility limits. These processes can benefit from digital twins, which create an opportunity to experiment and thus an unlimited opportunity to reach flexibility and autonomy for new ideas and testing. Also, the traditionally highly automated paper industry can benefit from smart, connected assets and add value that differentiates it from the competitors in the market (Miklovic 2017). However, there are legacy investments in the machines of manufacturing, which are long-term investments and are expected to remain productive until the end, as well as in the equipment and sensors that produce the outcome and data of industrial processes. Besides the functional composition, the system experts, systems and procedures that follow industrial processes have perfected workflows and value chains. These functional entities compete for the limited resources to maintain performance (i.e., reduce the downtime); on the other hand, systems optimization requires investments in IIoT-ready equipment. The pressure to renew towards the digitalization requirements means renovating legacy systems and processes (i.e., exploitation) to optimize the technological advantage, and exploring innovative approaches (i.e., exploration) to optimize problem-solving and decision making. Exploration involves experimenting, transparent communication and a learning process, and the problems and innovations should be seen through a lens of opportunities. The initiatives of exploration can adopt a minimum viable service/product/process/system. Combining the exploitation of smart things (i.e., Industry 4.0), the evolution of products and technologies, with new and innovative exploration underpins the manufacturing ambidexterity capability; therefore, both have an essential role in the digital transformation. Therefore, this paper presents a value proposition canvas for Industry 4.0 (VPC4I), BMC and Lean canvases,
Figure 8 Value Proposition Canvas for Industry 4.0 (VPC4I). The picture illustrates how the product, process or service features must meet the stakeholders' needs and add value for the stakeholders and the company. The circle illustrates a continuous analysis and reinforcement loop

which can make these issues more concrete.

2.5 The value proposition canvas for Industry 4.0

We adjusted the value proposition canvas to capture the essential fit for the needs of the stakeholder in Industry 4.0. Herein, the value proposition canvas is adapted from Osterwalder (Ovans 2015a) as follows: Customer Jobs are replaced with Stakeholder Activities (i.e., containing the possibilities of integrated, connected systems (i.e., human-human, human-machine, machine-machine)), Customer Gains with Stakeholder Desires, Customer Pains with Stakeholder Worries, Gain Creators with Sources, Pain Relievers with Solutions, and Product and Service with Product, Process and Service (Figure 8). The descriptions of the value proposition fields are represented in the company-stakeholder perspective in the form of questions, and the Product, Process and Service fit-answers are prompts (Table 1). Ensuring the stakeholder's engagement and the quality of the service, process and/or product requires that the stakeholder's activity problems and needs are continuously analyzed and responded to with the necessary measures. The research literature reviews of Harmeling et al. (2017) and Kunz et al. (2017) on customer engagement emphasize behavioural, psychological and motivational psychology perspectives. Furthermore, customer engagement involves product innovations, e.g., the Whirlpool "Every Day Care" project (Harmeling et al. 2017). The companies' three most common reasons to embrace IoT technology are customer experience and engagement, operational efficiency improvement, and actionable insights that establish strategic decisions (Lager 2017). The IIoT era focuses on the innovations of the product, service and process, which can improve existing and new things as well as combinations of them (i.e., exploitation and exploration). The feedback data from smart things and the stakeholder to the company provide information on how well strategically the product, process and services
Table 1 The Descriptions of the Value Proposition Fields for VPC4I

Shared value (common to all rows): the increased intelligence of product, service and process (i.e., smart things); increased market insight; increased competitiveness; information about the public image for managing and maintenance; increased performance and maintenance; improved safety, quality and operations; increased awareness; context-aware information transparency; augmented awareness of the situation and optimized recommendations for the possible courses of actions based on situational analysis and predictions; knowledge integrated into a learning environment. Fit: Products, Process and Services.

Stakeholder activity
- Company value: Functions, systems and smart things, which help the customer to accomplish the target | Stakeholder value: Targets | Question: Which functions must be fulfilled?
- Company value: Find how to improve systems (e.g. what information can help predict, optimize, anticipate and improve the outcome) | Stakeholder value: Key functions | Question: What are the critical tasks for systems?
- Company value: Provide technical assistance systems and the needed information and functions for technical assistant use | Stakeholder value: Assistance | Question: Which tasks or situations need technical assistance?
- Company value: Provide functions and information to assist decision making | Stakeholder value: Decision making | Question: Which tasks/functions require decision making?
- Company value: Produce timely results (i.e., time-critical systems) | Stakeholder value: Timing | Question: What is the time frame for the results?
- Company value: Provide smart things and solutions with functionality to increase awareness and technical support | Stakeholder value: Awareness | Question: Which functions need monitoring, control and/or autonomy?
- Company value: Analyze and name the functional KPIs (i.e. the beneficial factors of the functions) | Stakeholder value: KPIs | Question: What are the factors of the functional progress?
- Company value: Compliance with commonly agreed rules of data usage, for example to downstream, analyze, blend, control and store | Stakeholder value: Downstream, Blend, Control, Store | Question: What information is allowed to downstream?
- Company value: Analyze and provide information based on actionable insights | Stakeholder value: Actionable insights | Question: What information is needed for strategic decision making?
- Company value: Support for transparency | Stakeholder value: Transparency | Question: Where does transparency need to be added?
- Company value: Help the stakeholder to serve a need | Stakeholder value: Services for the needs | Question: Which basic needs does your stakeholder serve to satisfy?
- Company value: Satisfy the social and emotional needs | Stakeholder value: Profile target, Emotional targets | Question: What social and emotional needs need to be fulfilled?
Stakeholder worries / Solutions
- Company value: Analyze, simulate, predict information to find an opportunity for savings and avoid the unexpected costs | Stakeholder value: Undesired cost | Question: What causes the undesired cost for the stakeholder?
- Company value: Use measures to secure information | Stakeholder value: Security | Question: What information is considered confidential?
- Company value: Remove the disadvantage factors | Stakeholder value: Negative emotions | Question: What causes an undesired feeling?
- Company value: Develop a solution to meet the requirements (security, timing, functionality) | Stakeholder value: Underperforming solutions | Question: In what respect are the current solutions underperforming for your customer?
- Company value: Analyze, simulate, predict information to find an opportunity to optimize | Stakeholder value: Optimization | Question: What can be optimized?
- Company value: Settle difficulties and challenges | Stakeholder value: Difficulties and challenges | Question: What kind of difficulties and challenges does your stakeholder encounter?
- Company value: Develop a solution to meet and treat negative social effects (e.g. analyze and react to the causes of the changes to the engagement and the sentiment) | Stakeholder value: Social effects | Question: What kind of negative social consequences does your stakeholder confront?
- Company value: Analyze, simulate, predict, minimize or eliminate the risks | Stakeholder value: Risks | Question: What are the risks for your stakeholder?
- Company value: Exterminate the possibilities for mistakes (add understanding, transform the data into easy-to-understand predictions and simulations (e.g. visual form)) | Stakeholder value: Mistakes | Question: What are the mistakes of the stakeholder?
- Company value: Find solutions over obstacles (such as financial barriers, avoided by creating win-win solutions) | Stakeholder value: Barriers | Question: What prevents the stakeholder from adopting your solution?
Customers' desires / Sources
- Company value: Create direct and indirect savings for stakeholder and company (i.e., present savings in the form of concrete indicators) | Stakeholder value: Savings | Question: What type of savings (i.e., resources) could make your stakeholder happy?
- Company value: Predict, simulate, implement the outcomes that meet or go beyond stakeholder expectations | Stakeholder value: Outcome expectations (e.g. operational efficiency) | Question: What does your stakeholder expect, and what would exceed expectations?
- Company value: Accomplish the benefits of the current solutions; improve solutions with the help of smart things, and innovate beneficial solutions and capabilities | Stakeholder value: Benefits | Question: How are the current solutions benefiting your stakeholder / What is good in the current solution?
- Company value: Make it easier to reach the targets (i.e. comprehensible assistance systems and decision-making support) | Stakeholder value: Relief | Question: How to make your stakeholder reach his/her targets painlessly?
- Company value: Construct the infrastructure of the ecosystem which assists in fulfilling the requirements of, i.e., the smart thing, product, process and service | Stakeholder value: Ecosystem | Question: Does the solution exist partly or entirely?
- Company value: Create and maintain a desired social image for a product or service; continuously analyze the sentiment and engagement of the stakeholder and further respond with the necessary measures | Stakeholder value: Social image | Question: What are the desired social consequences?
- Company value: Pursue and implement the stakeholder's vision | Stakeholder value: Seek/Dream | Question: What are your stakeholders searching for?
- Company value: Match your outcomes to your own and your stakeholder's indicators | Stakeholder value: Indicators | Question: How are success and failure measured at the stakeholder?
- Company value: Yield up the things which make adoption easier; present context-aware, transparent, augmented and actionable information to use | Stakeholder value: Adoption | Question: What would assist the solution adoption?
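The field descriptions in Table 1 can be encoded as a simple data structure, e.g. for canvas-filling tooling; the following is an illustrative sketch, not part of the original study, and the class, field and function names are ours. Only a few rows of the table are transcribed:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VPC4IField:
    """One row of the VPC4I field descriptions (cf. Table 1)."""
    block: str              # canvas block, e.g. "Stakeholder activity"
    company_value: str      # what the company does
    stakeholder_value: str  # keyword for the stakeholder's concern
    question: str           # prompt used when filling the canvas

# A few rows transcribed from Table 1 (not the full table).
FIELDS = [
    VPC4IField("Stakeholder activity",
               "Provide functions and information to assist decision making",
               "Decision making",
               "Which tasks/functions require decision making?"),
    VPC4IField("Stakeholder worries / Solutions",
               "Analyze, simulate, predict, minimize or eliminate the risks",
               "Risks",
               "What are the risks for your stakeholder?"),
    VPC4IField("Customers' desires / Sources",
               "Match your outcomes to your own and your stakeholder's indicators",
               "Indicators",
               "How are success and failure measured at the stakeholder?"),
]

def prompts_for_block(block: str) -> list:
    """Return the canvas questions belonging to one block."""
    return [f.question for f in FIELDS if f.block == block]
```

For example, `prompts_for_block("Stakeholder activity")` yields the decision-making question for that block; a complete tool would carry all rows of the table and the shared-value annotations.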
meet the stakeholder's targets and which func-

3. Cognitive services, canvases and Industry 4.0

Cognitive services are building blocks for, for example, systems which imitate human mind processes. These services enhance the automation features of systems and enable the construction of systems that release resources, or systems where the interdependence of human and technological solutions is needed (IBM 2017a). As a part of AI, cognitive services include algorithms and systems such as natural language processing, machine learning, deep learning, and neural networks. These, in turn, create the abilities of solutions such as learning, identifying a speaker or an image, understanding natural languages, identifying the social tendencies of a text (i.e., joy, anger, disgust, fear, sadness) and predicting personality traits from textual content (IBM 2017b). The combination of large parallel processing power (i.e., cloud computing), sophisticated algorithms (i.e., cognitive services), and vast data sets (i.e., big data) constitutes the three basic pillars of intelligent systems. This increases the role of the cloud platform and the requirement of inter-connectivity and connections essential for Industry 4.0, since it integrates functionalities, a vast amount of information, cognitive services and smart things. Cognitive services for knowledge retrieval can use custom annotator components for the special needs of an industry (IBM 2017c). These features (e.g., an industry-specific knowledge base) can be useful in the Industry 4.0 environment, where industry- and product-specific technical assistance and decentralized decision support are needed. See the examples of IoT use cases for continuing the improvements of the product (Watson 2016) and asset management for value propositions (Watson 2017). We wanted to experiment with the ability of intelligent applications (i.e., cognitive services) to meet the needs of the product, process and service ambidexterity in industrial business (new smart object development (i.e., exploration) and existing smart object improvement (i.e., exploitation)). Therefore, we explore the general characteristics of cognitive services in the BMC, Lean and value proposition canvases to form an overall picture of the necessary issues in this context. When we conducted our mappings of cognitive services capabilities to the value propositions for Industry 4.0, we realized that all the product, process and service issues (smart things) could benefit from cognitive properties, and that this would be equally profitable for things development and new things (i.e., exploration and exploitation). In this context, we emphasize the importance of matching your results to your stakeholder's indicators, as they can be used as benchmarks (e.g., in the analysis for evaluating performance or quality) and increase transparency. Further, we mapped (Figure 9) the Industry 4.0 (i.e., smart things, big data and cognitive services) characteristic issues to the BMC. When we conducted our mappings (i.e., Industry 4.0 to the BMC), we realized that each activity, so that it could be functional, needs resources; e.g., the Cloud of the things needs to perform functions, and those functions need resources. Further, all the actors are involved in forming the digital ecosystem. The infrastructure of the ecosystem consists of services and structures that enable Industry 4.0 to function. Infrastructure is divided into technical and social; hence, it engages with the ecosystem (i.e., interconnections and systems between machines, devices, sensors and people). In the Cloud of the things, analytic applications can integrate disparate data sources and allow the analysis of the inner related business impact. That supports the business decisions, since built-in analytic systems can provide asset monitoring (e.g., the state of assets, risks), predictive and preventive maintenance, and situational analysis (e.g., awareness, decision support).

Figure 9 BMC for Specific Needs of Industry 4.0

Figure 10 Lean Canvas for Specific Needs of Industry 4.0

In the next experiment, we mapped (Figure 10) the Industry 4.0 characteristics to the Lean canvas (i.e., a new smart thing, exploration) and its characteristic blocks (i.e., Problem, Solution, Key metrics, Unique value proposition and Unfair advantage). We realized in our mappings that the Unique value proposition obviously needs 'a smart thing' proposition (i.e., a marketing promise, which ties the problem and the solution together); otherwise, there is no possibility for a unique value. These smart things' arguments for the stakeholder could be, for example, the capabilities of cognitive services, such as the system's ability to learn and classify, which helps to provide personal advice and proposals that contain the arguments for savings, or the ability to monitor the engagement of their stakeholders (e.g., to give early insights or to show continued development). 'Offers' of 'cognitive services' come with solutions that can help find and meet customers' and stakeholders' social and emotional needs (e.g. 113 Industry (2017) and the personality insights service). The services that transform the data of smart things
Figure 11 The Example of Lean Canvas of Quality Enhancements using Smart Things

into the form that supports human reasoning offer outcomes that support optimal decision-making, add awareness and autonomy, and help the user accomplish his/her target. The key metrics are the indicators toward the strategic goal, and they should be the value metrics that later measure your key factors of growth (Maurya 2017). The key metrics are case-specific indicators, meaning that they need to be measurable and anchored on objective insights (i.e., transparent, augmented and actionable information). Cognitive analytics help analyze the manufacturing data (e.g., smart things, processes, systems) to provide predictions of the outcome and find the factors that affect the key metrics, thus finding the issues to improve the results. Besides, it supports risk analysis and can provide solution proposals for minimizing the risk (e.g., predictive maintenance alerts for failures) and finding the causes of persistent failures, thereby optimising the performance. The cognitive systems accept natural language interactions; e.g., repair feedback can be given as a spoken description, which can further improve the system predictions. Unfair advantage is a competitive advantage, or barriers to entry, which gives the means of defence against competitors and copycats. For example, a key resource can be an unfair advantage if it serves as a defence (Maurya 2017). We realized that an advanced technological system, e.g. Industry 4.0 things, can serve as an unfair advantage (Figure 7). Often, the problem initiates creating the solution when we have a direct stakeholder relationship (e.g., omnichannel). We observe / interview stakeholders (i.e., retrieve stakeholder information). This way identifies the problems that further lead to a solution. By analyzing the stakeholders' data, we can trace the stakeholder segments by cases (i.e., intelligent cognitive solutions).

4. Sound and Visual Recognition

IBM Watson Visual Recognition analyzes images with deep learning algorithms; these algorithms can be used to create custom classifiers, which are tailored to specific needs (IBM 2017d). The service analyzes images with algorithms that are backward-chaining neural nets. These learning algorithms progressively infer patterns about an ensemble of data by starting with the goal and setting defined rules that can be extended to other goals (Earley 2015). The service can iden-
Table 2 The Example Embedded Descriptions of Embedded Canvases for the Production Line Issues

Shared value: as in Table 1. Fit: Products, Process and Services.

Stakeholder activity
- Company value: Activities of smart things: Monitoring (M) to expose the non-visual damages, incorrect colour and form | Stakeholder value: The soundcheck result in the range 0,97-1,00; the colour and form check result in the range 0,95-1,00. Monitoring target: 100% of the products will be inspected, and information about the results is collected and analyzed. | Question: Which functions must be fulfilled?
- Company value: Control (C) the quality (pull apart non-valid products) | Stakeholder value: Control target: 100% of the non-worthy vases are recycled, data is collected.
- Company value: Optimization (O): the results of the smart things with analytics allow finding the optimized settings for minimizing the non-valid quality | Stakeholder value: Optimization target (the reduction percentage of rejections): set the optimal raw material purity (20%), set the optimal oven temperatures for the materials (20%), establish optimized forming (20%) and colouring and glazing settings (20%), optimization of production line maintenance, i.e., quality problems caused by the failure of the production line (3%).
- Company value: Automation (A): interconnected smart things enable automatic quality inspection; Identification (RFID); Cloud of things activities: analytics, assistance, intelligence functions, MCOA functions for the things, security, interconnection between smart things, aggregating and visualizing information | Stakeholder value: Automation target: 95% automatic quality inspection (human interventions 5%). Maintenance target: reduction of breakage during production (100%). Analytics target: collected data analysis 100%. Assistance target: 100% support for interventions. Security target: 100% secure connections.

Customers' desires / Sources
- Company value: Create direct and indirect savings for stakeholder and company (i.e., present savings in the form of concrete indicators) | Stakeholder value: Savings: save on working time 4%/year due to a lesser amount of reclamations. The measures: working time spent on returned vases (i.e., reclaims), average reclamation time per reclaimed unit (7 min/vase; for ten units, avoided reclamations make 1,16 h of working time savings). Reduced logistics costs, since broken and replacement vases will not be sent back and forth, 15%/month. Production line improvements (30%), based on analysis of collected data from the production line (i.e., smart things). Production line maintenance: the reduction of downtime (12%) and rejection causes (2.5%). | Question: What type of savings (i.e., resources) could make your stakeholder happy?

tify, e.g., food and colours, and further categorise and tag the images. We use this service in a manufacturing environment to provide a classifier between damaged and prima products. Our example comes from vase production, where claims about bad product quality have occurred (Figure 11) and where the company wanted to find a solution to improve the quality, because the decline in the product's quality can be noticed in key metrics: e.g., the number of customer complaints per 1000 delivered products, exchanged or returned products per first pass yield, and process waste levels have been rising. The first step towards improving the quality in the production line was the sorting of finished products between defective products (i.e., recycling material) and prima products. Therefore, we proposed equipping the production line sorting with smart things: Sound Recognition and Visual Recognition (Figure 12). In our example, the product line categorization into damaged and prima products happens in two phases: first, Sound Recognition is used to expose the non-visual damages with a sound test, and after that Visual Recognition is used to ensure the correct colour and form (Figure 12). Sound Recognition encompasses the verification and identification of the sound. The sound verification underpins the sound's characteristics (i.e., the sound signature of the prima product sound), which are used for clustering the products based on the similarities of their sounds. Further, sound identification compares the audio input (i.e., the sound) with the provided group of prima product sounds. If the match of the sound is
Figure 12 Vase Production Line with Smart Things Sound and Visual Recognition.
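The two-phase gate described above can be sketched as a simple decision rule, using the thresholds from the example (a sound correlation of at least 0,97 against the prima sound signature, and a visual colour/form score of at least 0,95). This is an illustrative sketch of the decision logic only, not the study's implementation, and the function and variable names are ours:

```python
def classify_vase(sound_correlation, visual_score):
    """Two-phase quality gate for the vase production line.

    A vase is 'prima' only if its sound matches the prima sound
    signature (correlation >= 0.97) AND its colour/form classifier
    score is high enough (score >= 0.95); both scores lie in [0, 1].
    """
    if sound_correlation < 0.97:   # phase 1: expose non-visual damage
        return "rejected: sound"
    if visual_score < 0.95:        # phase 2: colour or form defect
        return "rejected: visual"
    return "prima"

# A cracked vase fails the sound test before the visual check matters.
print(classify_vase(0.88, 0.99))  # rejected: sound
print(classify_vase(0.99, 0.96))  # prima
```

Because the cognitive services return probabilistic scores rather than hard labels, the thresholds are tuning parameters: tightening them lowers the rate of escaped defects at the cost of more false rejections, which is exactly the trade-off the collected monitoring data is meant to optimize.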

If a match of the sound is found, the product is identified as having prima sound quality. The accepted results lie within a predefined range of correlations (e.g., if the sound corresponds to 0.97 of the intact product sounds, it is valid). Furthermore, to ensure the correct colour and form, Visual Recognition uses a neural network to infer the rules for categorizing the right colour combination and prima product form (i.e., custom classifiers). The service results are represented by a score, whose range varies between 0 and 1 (i.e., a better correlation gets a higher score). We use these quality scores to match against the company's colour and form requirements (e.g., products scoring under 0.95 for colour will be rejected from the prima product category). The digital information from these smart services is therefore used to make the right decisions and take the right actions to improve quality. Further, our approach to verifying product quality supports the process analysis, where the smart things' results are used to build product quality into the production process. The analysis of sound rejection results focuses on the raw material and the temperature of the firing process; in contrast, the analysis of visual results focuses on the forming, colouring and glazing processes. Table 2 gives example descriptions of the embedded BMC, Lean Canvas and VPC4I value proposition fields for the production line quality issues.

5. Discussion

This paper aims to bring the technological advantages of digitalization closer to business objectives. Firstly, it provides an example of a data and business origin approach to utilizing a new technology. Secondly, it provides an example framework for achieving measurable advantages in digitalization. The frame is reusable, and the reuse is easy to conduct because the base frame is already well known; in other words, the modifications and supplements concern the Industry 4.0 and smart things perspective. In general, a formal approach (i.e., a framework) helps with systems governance, especially when introducing intelligent things that enhance system capabilities, which creates opportunities for direct and indirect savings as well as new perspectives and sources of innovation (Figure 7). Transparent data-based reasoning (i.e., with indicators and measurements) will be more easily accepted in a conservative development environment, so improvement of the systems will confront fewer barriers; thus, the approach is of real practical importance. There is a confusing number of technical choices (i.e., high-tech umbrellas) offering alternatives to implement, integrate and control, among which we need to find strategic opportunities and meet the industry's challenges. Technology provides different approaches, such as cloud, fog and edge computing and digital twins, all of which contribute to Industry 4.0.
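The two-phase quality gate described above (a sound-correlation check, then a visual score check) can be sketched in a few lines. The thresholds 0.97 and 0.95 come from the example; the correlation measure and all function names below are illustrative assumptions, not the actual cognitive-service API.

```python
import numpy as np

def sound_match(sample: np.ndarray, signature: np.ndarray) -> float:
    """Normalized correlation between a recorded sound and the
    prima-product sound signature (1.0 = identical)."""
    sample = (sample - sample.mean()) / (sample.std() + 1e-12)
    signature = (signature - signature.mean()) / (signature.std() + 1e-12)
    return float(np.dot(sample, signature) / len(sample))

def classify(sound_score: float, visual_score: float) -> str:
    """Two-phase gate from the example: sound first, then colour/form."""
    if sound_score < 0.97:          # sound-signature threshold from the text
        return "rejected: sound"    # points to raw material / firing process
    if visual_score < 0.95:         # visual-score threshold from the text
        return "rejected: visual"   # points to forming / colouring / glazing
    return "prima"
```

A product passing both checks would be routed to the prima category; each rejection label points the process analysis at a different production step, as described in the text.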

Table 3 The Summary Table of Technology/Business Concepts

Key concepts Descriptions

Value proposition canvas A value proposition is “the promise of measurable benefits resulting
from the collaboration” (ISO ORG 2017). Value propositions have been
constructed to reflect and to show the new technology-based possibilities
and needs in Industry 4.0.
Business Canvases Business canvases are a presentation of a business model that describes
financial streams, value propositions, key partners, activities, resources,
stakeholder relationships, channels and segments, and specific problems,
solutions, key metrics, competitive advantages or ways to protect the busi-
ness. Canvases are used to find and match intelligent insights of cognitive
services from IIoT and knowledge retrieval to business objectives.
Ambidexterity The concept, in the form of exploitation and exploration, illustrates the
tensions in the business model that are often caused by limited re-
sources. New technologies challenge organizations; they need to find a
balance between exploitation and exploration.
Smart thing SMART, a.k.a. “Self-Monitoring, Analysis and Reporting Technology for
prediction of device degradation and, or faults” (ISO ORG 2009); in this
context, the device can be a product, a process or a service.
Industry 4.0 The combination of automation and data exchange, including cyber-
physical systems, IoT, cloud computing and cognitive computing in man-
ufacturing or engineering.
Technology stack Presentation of the needed solutions to be delivered through the technical
components and services that enable reaching a target.
Blockchain “Distributed ledger with confirmed blocks organized in an append-
only, sequential chain using cryptographic links” (ISO ORG 2020a).
Blockchain is introduced to add security in interconnections and con-
nectivity.
Cognitive service A set of algorithms designed to solve problems by mimicking human
cognitive functions.
Knowledge base “Facts, information and skills acquired through research, experience,
reasoning or education on a specific topic. As a set of a declarative,
hierarchical organization of such statements. And relationships between
declarative statements, which serves as the underpinning of decision
support systems” (ISO ORG 2020b).
“Database that contains inference rules and information about human
experience and expertise in a domain. In self-improving systems, the
knowledge base additionally contains information resulting from the so-
lution of previously encountered problems” (ISO ORG 2015). A knowl-
edge base concept is presented to augment intelligence in solutions.
Digital Twin “The compound model composed of a physical asset, an avatar and an
interface” (ISO ORG 2020c). “The digital asset on which services can
be performed that provide value to an organization. The descriptions
comprising the digital twin can include properties of the described asset,
IIOT collected data, simulated or real behaviour patterns, processes that
use it, software that operates on it, and other types of information.” (ISO
ORG 2020d). A digital twin concept is presented to show the capability
to simulate a real situation, and therefore it can promote the needed
support or decisions.

Smart-contract “A computer program stored in a distributed ledger system wherein the
outcome of any execution of the program records on the distributed
ledger” (ISO ORG 2019). Smart-contract is part of the blockchain and
has been introduced to supplement edge and fog computation.
Omni-channel An approach that gives the user a seamless experience at any point
of connection. It is used to augment awareness through support of the
omni-channel approach.
Fog computing A flexible configuration where all functions can be performed at any level
of the architecture. Presented to follow and fulfil the different needs of
the process and receive agile, smart information.
Edge computing “Distributed computing in which processing and storage take place at
or near the edge, where the system’s requirements define the nearness”
(ISO ORG 2020e). Presented to follow and fulfil the different needs of
the process and receive agile, smart information.
BI&A Business intelligence and analytics
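The Blockchain and Smart-contract rows in Table 3 describe an “append-only, sequential chain using cryptographic links”. Below is a toy sketch of that chaining idea only, with invented function names and payloads; it is not a real distributed-ledger API.

```python
import hashlib
import json

def add_block(chain: list, payload: dict) -> dict:
    """Append a block whose hash covers the previous block's hash,
    forming an append-only, sequential chain of cryptographic links."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    block = {"payload": payload, "prev_hash": prev_hash}
    block["hash"] = hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()).hexdigest()
    chain.append(block)
    return block

def verify(chain: list) -> bool:
    """Recompute every link; any tampered payload breaks the chain."""
    for i, block in enumerate(chain):
        body = {"payload": block["payload"], "prev_hash": block["prev_hash"]}
        if block["hash"] != hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()).hexdigest():
            return False
        if i and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True
```

Tampering with any earlier payload invalidates every later link, which is the security property the table rows point to for interconnections and connectivity.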

When we explored cognitive services' capabilities for the value propositions of Industry 4.0, we realized that smart things could benefit from cognitive properties. They would be equally exploitable in the development of existing and new things (i.e., exploration and exploitation). Further, in the cloud of things, analytic applications can integrate disparate data sources and analyse the interrelated business impact, which supports business decisions. It also allows stakeholders to have a personalized, seamless experience of products, processes and services in different contexts. A digital twin model integrated into Industry 4.0 expands the possibilities of a data-driven strategy, since it can support 100% transparency and objective insights. It continuously updates itself to reflect the thing's condition, and it uses all collected data for learning from similar conditions. When integrated into BI systems, it can evaluate opportunities and benefits, and as an outcome it gives options with reasoning based on simulations. In turn, fog and edge computing are the flexible configuration of IoT, where all functions (i.e., processing, monitoring and controlling, data storage) can be realized at any member device of the architecture. The idea is to keep data and processing near the end-user, for instance for time-critical applications, and to provide benefits such as real-time analytics and enhanced security (Raj and Raman 2017). Besides, the emerging 5G bandwidth will add to the benefits of these approaches, since it aims to satisfy the existing and coming needs of IoE. Furthermore, blockchain provides the means to ensure the security and data auditing of interconnections and connectivity. Table 3 gathers the key technology/business concepts of this paper and their descriptions. Digital intelligence may not exist ready-made for specific industry solutions but must be experimentally discovered and developed. So, to move towards smart things, sensors and components with the capacity to process data must have embedded intelligence for the needed functions (e.g., intelligence at the edge, intelligence on a cloud, fog intelligence). The example of the production line has exactly predefined measures, where functions are divided by activities and further into functional measures for the acceptable quality (i.e., an explicit target).
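The fog/edge idea of keeping data and processing near the end-user can be sketched as an edge-side filter that handles normal readings locally and forwards only anomalies upstream. The class name, window size and threshold below are illustrative assumptions, not part of any cited architecture.

```python
from collections import deque

class EdgeFilter:
    """Keep raw readings at the edge; forward only out-of-range events
    upstream, so time-critical analytics stay near the machine."""

    def __init__(self, window: int = 50, tolerance: float = 3.0):
        self.readings = deque(maxlen=window)  # local rolling buffer
        self.tolerance = tolerance            # deviations-from-mean threshold

    def ingest(self, value: float):
        """Return the value if it is anomalous (to be sent to the cloud),
        otherwise None (handled locally)."""
        anomalous = None
        if len(self.readings) >= 10:  # wait for a minimal baseline
            mean = sum(self.readings) / len(self.readings)
            var = sum((r - mean) ** 2 for r in self.readings) / len(self.readings)
            if abs(value - mean) > self.tolerance * max(var ** 0.5, 1e-9):
                anomalous = value
        self.readings.append(value)
        return anomalous
```

Only the rare anomalous readings leave the device, which is the bandwidth and latency benefit the fog/edge rows in Table 3 describe.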

These functional measures permit the interconnected smart things to collaborate, make decisions and take actions. They augment production line transparency and awareness, and they enable situational analysis. Generally, business models and VPC4I support research and exploitation in Industry 4.0 (i.e., the construction and improvement of smart digital products, processes and services). After these experiments, we can see that the perspective on the phenomenon is versatile, and thus the approach offers a complementary view. The qualitative research methods used in this paper are commonly used in exploring usage and operations, technology transfer and systems evaluations, and in the content analysis of software engineering.

Acknowledgments

Special thanks to the referees for their help in improving the quality of the paper.

References

113 Industry (2017). Online: http://www.113industries.com/technology/ibm-watson/.
Aaron Agius (2017). 7 inspiring examples of omni-channel user experiences. Online: https://blog.hubspot.com/marketing/omni-channel-user-experience-examples.
Bitcoin (2017). Bitcoin developer guide. Online: https://bitcoin.org/en/developer-guide#block-chain.
Bego Blanco, Jose Oscar Fajardo, Ioannis Giannoulakis, Emmanouil Kafetzakis, Shuping Peng, Jordi Pérez-Romero, Irena Trajkovska, Pouria S. Khodashenas, Leonardo Goratti, Michele Paolino, Evangelos Sfakianakis, Fidel Liberal, George Xilouris (2017). Technology pillars in the architecture of future 5G mobile networks: NFV, MEC and SDN. Computer Standards and Interfaces 54(4):216-228, ISSN 0920-5489.
Canvas (2012). Online: http://businessmodelalchemist.com/blog/2012/08/achieve-product-market-fit-with-our-brand-new-value-proposition-designer.htm.
Cisco (2018a). Analytics for fog computing. Online: https://532386f9a72d1dd857a8-41058da2837557ec5bfc3b00e1f6cf43.ssl.cf5.rackcdn.com/wp-content/uploads/2016/03/fog2.jpg.
Cisco (2018b). Fog architecture. Online: https://532386f9a72d1dd857a8-41058da2837557ec5bfc3b00e1f6cf43.ssl.cf5.rackcdn.com/wp-content/uploads/2016/03/fog1-e1458663437311.jpg.
DeFranco, J F and Laplante, P A (2017). A content analysis process for qualitative software engineering research. Innovations Syst Softw Eng 13:129.
Dubley Sarah, IBM (2018). How the IoT will change manufacturing operations, IBM May 2017. Online: https://www.ibm.com/blogs/internet-of-things/change-manufacturing-operations/.
Earley S (2015). Cognitive computing, analytics, and personalization. IT Pro July/August:12-18.
Ethereum (2018). Create a digital greeter. Online: https://www.ethereum.org/greeter.
Fernando A F Ferreira, João J M Ferreira, Cristina I M A S Fernandes, Ieva Meidutė-Kavaliauskienė, Marjan S Jalali (2017). Enhancing knowledge and strategic planning of bank customer loyalty using fuzzy cognitive maps. Technological and Economic Development of Economy 23(6).
Ferrucci D and Lally Adam (2004). UIMA: An architectural approach to unstructured information processing in the corporate research environment. Journal of Natural Language Engineering.
Few Stephen (2015). Signal. ISBN: 978-1-938377-05-1.
Gartner (2017). 5 steps to address IoT integration challenges, May 4, 2017. Online: http://www.gartner.com/smarterwithgartner/five-steps-to-address-iot-integration-challenges/.
Gartner (2018). Gartner's top 10 strategic technology trends for 2019. Online: https://www.gartner.com/smarterwithgartner/gartner-top-10-strategic-technology-trends-for-2019/.
Gartner (2020). Gartner identifies the top 10 strategic technology trends for 2020. Online: https://www.gartner.com/en/newsroom/press-releases/2019-10-21-gartner-identifies-the-top-10-strategic-technology-trends-for-2020.
Glass R, Ramesh V, Vessey I (2004). An analysis of research in computing disciplines. Indiana University, USA. Sprouts: Working Papers on Information Systems 4(23). http://sprouts.aisnet.org/4-23.
D Gorecky, M Schmitt, M Loskyll, D Zühlke (2014). Human-machine-interaction in the Industry 4.0 era. 12th IEEE International Conference on Industrial Informatics (INDIN) 2014, pp. 289-294, Porto Alegre, RS, Brazil, July 27-30, 2014.
Graves A, Wayne G, Danihelka I (2014). Neural Turing machines. CoRR. Available: arXiv:1410.5401.
Harmeling, Colleen M, Moffett, Jordan W, Arnold, Mark J, Carlson, Brad D (2017). Toward a theory of customer engagement marketing. Journal of the Academy of Marketing Science 45(3):312-335.
Hazem M, Raafat M, Shamim Hossain Senior, Ehab Essa, Samir Elmougy, Ahmed S Tolba, Ghulam Muhammad, Ahmed Ghoneim (2017). Fog intelligence for real-time IoT sensor data analytics. IEEE Access 2017;5:24062-9.
M Hermann, T Pentek and B Otto (2016). Design principles for industrie 4.0 scenarios. 2016 49th Hawaii International Conference on System Sciences (HICSS), Koloa, HI, pp. 3928-3937, 2016. doi: 10.1109/HICSS.2016.488.
M ten Hompel and B Otto (2014). Technik für die wandlungsfähige Logistik: Industrie 4.0. 23. Deutscher Materialfluss-Kongress 2014.
Howson Cindi, Sallam Rita L, Richardson James Laurence, Oestreich Thomas W, Tapadinhas Joao, Idoine Carlie J (2017). Critical capabilities for business intelligence and analytics platforms. Gartner. Online: https://www.gartner.com/doc/reprints?id=1-3UOZ5IZ&ct=170302&st=sb.
IBM (2018). Model factory. Online: https://www-935.ibm.com/industries/manufacturing/industry-4.0-model-factory/.
IBM and Apache (2018). Data streaming at the edge: IBM and Apache Quarks. Online: https://www.rtinsights.com/data-streaming-edge-analytics-apache-quarks/.
IBM (2017). Watson customer engagement. Online: https://www.ibm.com/watson/customer-engagement/ (accessed on 21 September 2017).
IBM (2017a). Watson products and services. Online: https://www.ibm.com/watson/products-services/.
IBM (2017b). Watson knowledge studio deep dive. Online: https://www.ibm.com/us-en/marketplace/supervised-machine-learning/resources#product-header-top.
IBM (2017c). Watson continuous engineering for the IoT. Online: https://www.ibm.com/internet-of-things/iot-solutions/product-development/.
IBM (2017d). Watson developer cloud, visual recognition. Online: https://visual-recognition-demo.ng.bluemix.net/.
ISO ORG (2009). Definitions and abbreviations — SMART. ISO/IEC 24739-1:2009(en), Information technology — AT Attachment with Packet Interface - 7 — Part 1: Register Delivered Command Set, Logical Register Set (ATA/ATAPI-7 V1).
ISO ORG (2015). Terms and definitions — knowledge base. ISO/IEC 2382:2015(en), Information technology — Vocabulary.
ISO ORG (2017). Terms and definitions — value proposition. ISO 44001:2017(en), Collaborative business relationship management systems — Requirements and framework.
ISO ORG (2019). Terms and definitions — smart contract. ISO/TR 23455:2019(en), Blockchain and distributed ledger technologies — Overview of and interactions between smart contracts in blockchain and distributed ledger technology systems.
ISO ORG (2020a). Terms and definitions — blockchain. ISO 22739:2020(en), Blockchain and distributed ledger technologies — Vocabulary.
ISO ORG (2020b). Terms and definitions — knowledge base. ISO/TS 22756:2020(en), Health informatics — Requirements for a knowledge base for clinical decision support systems to be used in medication-related processes.
ISO ORG (2020c). Terms and definitions — digital twin. ISO/TR 24464:2020(en), Automation systems and integration — Industrial data — Visualization elements of digital twins.
ISO ORG (2020d). Terms and definitions — digital twin. ISO/TS 18101-1:2019(en), Automation systems and integration — Oil and gas interoperability — Part 1: Overview and fundamental principles.
ISO ORG (2020e). Edge computing — edge computing. ISO/IEC TR 23188:2020(en), Information technology — Cloud computing — Edge computing landscape.
Janet A Jaiswal (2015). IoE vs IoT vs M2M: What's the difference and does it matter? Sept 2015. Online: http://blog.aeris.com/ioe-vs.-iot-vs.-m2m-what-s-the-difference-and-does-it-matter.
A Kaushik, Sybex (2009). Web analytics 2.0: the art of online accountability and science of customer centricity. John Wiley and Sons, Dec 30: 37.
Kelly John E (2015). Computing, cognition and the future of knowing: how humans and machines are forging a new age of understanding. IBM 2015. Online: http://researchweb.watson.ibm.com/software/IBMResearch/multimedia/Computing_Cognition_WhitePaper.pdf.
Werner Kunz, Lerzan Aksoy, Yakov Bart, Kristina Heinonen, Sertan Kabadayi, Francisco Villarroel Ordenes, Marianna Sigala, David Diaz, Babis Theodoulidis (2017). Customer engagement in a big data world. Journal of Services Marketing 31(2):161-171. https://doi.org/10.1108/JSM-10-2016-0352.
Lager M (2017). Pint of view: the internet of things, linking machine to machine, separating hype from hope. Information Today, Inc, Medford, Customer Relationship Management, February 2017. Online: http://www.destinationcrm.com/Articles/Columns-Departments/Pint-of-View/The-Internot-of-Things-116097.aspx.
T W Malone (1999). Is 'empowerment' just a fad? Control, decision-making, and information technology. BT Technol J 17(4):141-144.
Ash Maurya (2017). Why lean canvas vs business model canvas? Online: https://blog.leanstack.com/why-lean-canvas-vs-business-model-canvas-af62c0f250f0.
McKendrick Joe (2016). Fog computing: a new IoT architecture? Online: https://www.rtinsights.com/what-is-fog-computing-open-consortium/.
MGI (2017). A future that works: automation, employment, and productivity. McKinsey Global Institute, January 2017. Online: http://www.mckinsey.com/global-themes/digital-disruption/harnessing-automation-for-a-future-that-works.
Dan Miklovic (2015). IIoT, smart connected assets, and the pulp and paper industry. Online: http://blog.lnsresearch.com/iiot-smart-connected-assets-and-the-pulp-paper-industry.
S K Miller (2001). Aspect-oriented programming takes aim at software complexity. Computer 34(4):18-21, April 2001. doi: 10.1109/2.917532.
Mind + Machines (2017). Meet a digital twin. Online: https://www.ge.com/digital/industrial-internet/digital-twin.
Mosch Christian (2020). RAMI 4.0 and Industrie 4.0 components. Online: https://industrie40.vdma.org/en/viewer/-/v2article/render/15557415.
Nokia (2017). Introducing 5G with Nokia: next step to automation. Online: https://www.youtube.com/watch?v=GoJOZOnJaMc.
OpenFog (2018). Why we need fog? Online: https://www.openfogconsortium.org/resources/.
O'Reilly C, Tushman M (2013). Organizational ambidexterity: past, present, future. The Academy of Management Perspective 2013 No. 4:324-338.
Alexander Osterwalder (2012). Achieve product-market fit with our brand-new Value Proposition Designer.
Andrea Ovans (2015). What we know, now, about the internet's disruptive power. Harvard Business Review 2015. https://hbr.org/2015/01/what-we-know-now-about-the-internets-disruptive-power.
Andrea Ovans (2015a). What is a business model? Harvard Business Review, Jan 2015.
Porter Michael E (2008). The five competitive forces that shape strategy. Harvard Business Review, January 2008.
Michael E Porter, James E Heppelmann (2014). How smart, connected products are transforming competition. Harvard Business Review, Nov 2014.
Raj P and Raman A C (2017). The Internet of Things: enabling technologies, platforms, and use cases. CRC Press, pp. 217-247.
Luis Ruiz-Garcia, Loredana Lunadei, Pilar Barreiro, Jose Ignacio Robla (2009). Review of wireless sensor technologies and applications in agriculture and food industry: state of the art and current trends. Sensors 2009 9(6):4728-4750. https://doi.org/10.3390/s90604728.
Scott Anthony (2015). When it comes to digital innovation, less action, more thought. Harvard Business Review, Jan 2015. Online: https://hbr.org/2015/01/when-it-comes-to-digital-innovation-less-action-more-thought.
Penna Sparrow (2019). Mesh topology: advantages and disadvantages. Online: https://www.ianswer4u.com/2011/05/mesh-topology-advantages-and.html.
Spectrum (2017). Everything you need to know about 5G. IEEE Spectrum, February 2017. Online: https://www.youtube.com/watch?v=GEx_d0SjvS0.
Xin Tang, Yaqiong Lv, Lei Tu, C K M Lee (2018). IoT based omni-channel logistics service in Industry 4.0. International Conference on Service Operations and Logistics, and Informatics (SOLI 2018), IEEE. doi: 10.1109/SOLI.2018.8476708.
TP (2019). Techopedia. Online: https://www.techopedia.com/definition/24398/mesh-networking.
Watson (2016). Cognitive manufacturing with Watson IoT, October 2016. Online: https://www.youtube.com/watch?v=AijIAPUh6ao.
Watson (2017). Watson internet of things: overview of end-to-end asset management. Online: https://www.ibm.com/internet-of-things/iot-solutions/asset-management/.
Wiedau Michael, Lars von Wedel, Heiner Temmen, Richard Welke, Nikolas Papakonstantinou (2018). ENPRO data integration: extending DEXPI towards the asset lifecycle. Chemie Ingenieur Technik 2019 91(3):240-255. doi: 10.1002/cite.201800112.
Wu D, Terpenny J, Zhang L, Gao R, Kurfess T (2016). Fog-enabled architecture for data-driven cyber-manufacturing systems. ASME International Manufacturing Science and Engineering Conference, Volume 2: Materials; Biomanufacturing; Properties, Applications and Systems; Sustainable Manufacturing.
Yle (2016). OneCoinia ei näy vieläkään, mutta rikastuneita pitäisi olla tuhatmäärin [OneCoin is still nowhere to be seen, but there should be thousands who have got rich]. Yle puoli seitsemän 2016 [3:27/4:46 min]. Available online: https://yle.fi/aihe/artikkeli/2016/03/11/onecoinia-ei-nay-vielakaan-mutta-rikastuneita-pitaisi-olla-tuhatmaarin.

Ulla Gain is a teacher at the University of Eastern Finland. Ulla has previously worked in the energy industry for over 20 years. She has developed and maintained product design tools and new technological expertise for the power plant business. She has a master's degree in Economics from the University of Jyväskylä and is a postgraduate student at the University of Eastern Finland School of Information Technology.
ULLA GAIN

There is an ongoing research gap in assessing
the possibilities of data manipulation
technologies and tools: for example, what
benefits they offer for manifesting insights
from data, what they can automate, and how
they affect human behaviour. The thesis
frames the cognitively computed insights
and proposes proofs of concept by cognitive
services. Cognitive services can be integrated
into solutions of different kinds, and they
are strongly related to human cognitive
functions.

uef.fi

PUBLICATIONS OF
THE UNIVERSITY OF EASTERN FINLAND
Dissertations in Forestry and Natural Sciences

ISBN 978-952-61-4461-0
ISSN 1798-5668
