Article
Smart Master Production Schedule for the Supply Chain:
A Conceptual Framework
Julio C. Serrano-Ruiz * , Josefa Mula and Raúl Poler
Research Centre on Production Management and Engineering (CIGIP), Universitat Politècnica de València,
Escuela Politécnica Superior de Alcoy, C/Alarcón 1, Alcoy, 03801 Alicante, Spain; [email protected] (J.M.);
[email protected] (R.P.)
* Correspondence: [email protected]
Abstract: Risks arising from the effect of disruptions and unsustainable practices constantly push
the supply chain to uncompetitive positions. A smart production planning and control process
must successfully address both risks by reducing them, thereby strengthening supply chain (SC)
resilience and its ability to survive in the long term. On the one hand, the antidisruptive potential
and the inherent sustainability implications of the zero-defect manufacturing (ZDM) management
model should be highlighted. On the other hand, the digitization and virtualization of processes by
Industry 4.0 (I4.0) digital technologies, namely digital twin (DT) technology, enable new simulation
and optimization methods, especially in combination with machine learning (ML) procedures. This
paper reviews the state of the art and proposes a ZDM strategy-based conceptual framework that
models, optimizes and simulates the master production schedule (MPS) problem to maximize service
levels in SCs. This conceptual framework will serve as a starting point for developing new MPS
optimization models and algorithms in supply chain 4.0 (SC4.0) environments.

Keywords: supply chain 4.0; master production schedule; zero-defect manufacturing; digital twin;
machine learning
Citation: Serrano-Ruiz, J.C.; Mula, J.; Poler, R. Smart Master Production Schedule for the Supply
Chain: A Conceptual Framework. Computers 2021, 10, 156. https://doi.org/10.3390/computers10120156
(https://www.mdpi.com/journal/computers)
Academic Editors: Pedro Pereira, Luis Gomes and João Goes
Received: 19 October 2021
Accepted: 19 November 2021
Published: 23 November 2021
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
Since artificial intelligence began to make its way into almost all the sectors of today’s
society, the adjectives intelligent or smart have become commonplace to describe a myriad
of entities which are, in one way or another, endowed with the ability to react to changes
in the environment to establish optimal operating conditions by themselves. We can find
some examples in the industrial sector, such as intelligent software, intelligent systems,
and intelligent agents, or smart grids, smart sensors, smart products, among others. For a
supply chain 4.0 (SC4.0), understood as the supply chain (SC) that is reorganized by using
the design principles and enabling technologies of the Industry 4.0 (I4.0) spectrum [1], it
seems appropriate to link the intelligent or smart attributes with SC abilities to overcome the
risks that it faces and survive as the main proof of its capability to respond to challenging
changes in the environment and to achieve optimal operating conditions. Along these
lines, and regardless of whether causes are natural, economic, political or technological,
disruption is the most significant risk that an SC faces in the short and medium terms. On a
long-term horizon, lack of sustainability is one of the main risks for SC survival. So an
SC4.0, as a smart SC, should be resilient and sustainable.
The effect of technological advances on industrial companies is indeed remarkable,
and guides their development toward a production paradigm in which resilience and
sustainability emerge as decisive SC management elements for both the future occupation
of better market positions and survival purposes [2]. SCs’ digital transformation,
experienced on the way toward SC4.0, can contribute to addressing those aspects that
compromise resilience and sustainability from a more favorable position by using I4.0
design principles and enabling technologies to mitigate the complexity and heterogeneity
imposed by the flow of materials, products and information through a means such as an SC,
which consists of multiple stages and nodes that are largely discrete and isolated from one
another [3]. Furthermore, the implementation of I4.0 principles and technologies to drive
SCs toward SC4.0 can be considered a process of progressive improvement and a transfor-
mation of SCs by managing the introduction of new technologies, and socio-environmental
dimensions, using sustainability and resilience as the core of improvement [4]. Indeed,
apart from the purely paradigmatic perspective of I4.0, two other relevant aspects can be
distinguished in SC4.0: technological and sustainability-related implications. In I4.0, new
information and communication technologies are mainly considered to address the rising
complexity of current industrial contexts and are, therefore, directly related
to digital transformation. SC4.0 also aims to integrate humans and the environment into
future industrial systems, which implies sustainable transformation [5].
Among other possibilities, one way to digitally and sustainably transform SCs with an eye
on the I4.0 paradigm is to promote the automation of SC production systems and processes,
as well as their ability to respond in real time to changes in the environment by rapidly
reacting to unforeseen situations and, thus, accomplishing their assignments [6]. It should
be noted that this approach is twofold: (i) this capability can
be improved by intervening in the design of the involved systems and processes themselves
by an endogenous or enabling strategy, but (ii) it can also be enhanced by acting on the
environment and its potential to alter those systems and processes by an exogenous strat-
egy that aims to mitigate the stochasticity of the milieu, which could also be described as
antidisturbing [7]. The application of the aforementioned approach to different SC areas re-
quires particularizing its implementation. In the production planning and control area that
requirement is, in turn, extensible to its three decision-making levels; i.e., strategic, tactical
and operational levels. Specifically at the tactical decision level, the procedures employed
in the master production schedule (MPS) can also be improved by applying
I4.0 enabling technologies and other management models to enhance SC resilience and
sustainability. According to the Association for Supply Chain Management (APICS) [8], the
MPS is determined by the SC planning environment according to four important
component elements: (i) its strategies (make to stock, make to order, engineer to order,
configure to order and assemble to order, among others); (ii) the number and type of
involved stakeholders (suppliers, warehousers, manufacturers, distributors, retailers); (iii)
its structure (hierarchy with its tiers and relations); (iv) the nature of its activities (production,
distribution and/or procurement). An approach to transforming MPS procedures that
pursues higher levels of SC resilience and sustainability should, therefore, take into account
its strategies, stakeholders, structure and activities. In this complex multiagent context,
the digital twin (DT) [6,9–13] potential to simulate, optimize, predict, share and visualize
data in real time is significant, and can be helpful to collaboratively assist the MPS from
the above-described endogenous perspective, whereas zero-defect manufacturing (ZDM)
can be used to support the exogenous one as it allows failures and defects during the
production process to be minimized, mitigated or eliminated according to the “do things
right the first time” philosophy [14–18]. ZDM can also be seen as a new standard from the
sustainability perspective because minimizing, mitigating or eliminating defective
products and processes implies a specific approach to waste management [17,19].
This research addresses the MPS through this integral approach.
An additional difficulty in addressing the MPS arises from the required computational
efficiency. At the tactical level, the MPS needs a temporo-spatial disaggregation of cumulative
planning targets and forecasts, along with the provision and forecasting of required
resources. This procedure becomes harder and slower as the number
of considered resources, products and time periods increases [20–22] because the space of
feasible solutions grows exponentially with the number of nodes (elements
containing a product, period and resource) in the system, and the problem is NP-hard.
Most classic modeling approaches (simulation methods, heuristics, metaheuristics,
matheuristics) present computational limitations as the MPS problem dimension
grows, particularly if the MPS is posed as a multi-objective issue. These limitations can
lead to unacceptable computational times for a decision support system (DSS) when this is
expected to facilitate real-time decision making, especially when the intention is to provide
it with a certain level of autonomy, just as the approach set out in this research calls for. The
machine learning (ML) potential to tackle this situation is remarkable at any production
planning and control decision level [23,24] and its application to the MPS problem should,
therefore, be considered. Furthermore, the feasibility to formulate the MPS problem as
a Markov decision process (MDP) [25–27] leads to the specific choice of reinforcement
learning as the main suitable candidate among existing ML methodologies.
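To make the MDP reading of the MPS problem more concrete, the following minimal sketch casts a toy single-product schedule as an MDP and solves it with tabular Q-learning. It is only an illustration under stated assumptions and not the formulation of refs. [25–27]: the state is the pair (period, inventory), the action is the quantity produced in the period, and the reward is the negative sum of holding and shortage costs. All names and numbers (PERIODS, MAX_INV, CAPACITY, DEMANDS, HOLD, SHORT) are hypothetical, and tabular Q-learning stands in for the deep reinforcement learning variants discussed in the literature.

```python
import random

# Hypothetical toy instance: every constant below is an illustrative assumption.
PERIODS = 4           # planning horizon (number of MPS time buckets)
MAX_INV = 5           # inventory cap; stock above the cap is discarded
CAPACITY = 3          # maximum quantity that can be produced per period
DEMANDS = (1, 2, 3)   # equally likely demand values per period
HOLD, SHORT = 1.0, 5.0  # unit holding cost and unit shortage penalty


def step(period, inv, qty, rng):
    """One MDP transition: produce qty, observe random demand, pay costs."""
    demand = rng.choice(DEMANDS)
    available = inv + qty
    unmet = max(0, demand - available)
    next_inv = min(MAX_INV, max(0, available - demand))
    reward = -(HOLD * next_inv + SHORT * unmet)
    return (period + 1, next_inv), reward


def train(episodes=5000, alpha=0.1, gamma=1.0, eps=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(t, i): [0.0] * (CAPACITY + 1)
         for t in range(PERIODS + 1) for i in range(MAX_INV + 1)}
    for _ in range(episodes):
        state = (0, 0)  # each episode starts in period 0 with empty stock
        while state[0] < PERIODS:
            if rng.random() < eps:
                action = rng.randrange(CAPACITY + 1)  # explore
            else:
                action = max(range(CAPACITY + 1), key=q[state].__getitem__)
            nxt, reward = step(state[0], state[1], action, rng)
            # Terminal states keep Q = 0, so the bootstrap term vanishes there.
            target = reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


def greedy_plan(q):
    """Production quantity with the highest learned value in each state."""
    return {s: max(range(CAPACITY + 1), key=q[s].__getitem__)
            for s in q if s[0] < PERIODS}


q_table = train()
plan = greedy_plan(q_table)
```

The resulting `plan` maps each (period, inventory) state to a production quantity. The table has one row per state, so its size grows with every added product, period or resource, which is the scaling issue that motivates function approximation in larger instances.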
Thus the combined use in the MPS process of (i) the DT enabling technology, (ii) the
ZDM management model, and (iii) ML-based modeling approaches is particularly relevant
because it can guide the SC toward positions of greater resilience and sustainability. For
this reason, it can be qualified as a smart approach: it provides threefold, complementary
assistance to SCs’ responsiveness to changes in their environment. Nevertheless, this joint
perspective of the smart improvement in the MPS problem has barely been addressed by the
academic community: only one paper in the existing literature provides a simple initial
conceptual framework that coincides with the joint approach herein indicated.
This paper presents an overview of the addressed topics from a joint perspective and,
to bridge this knowledge gap, proposes an initial ML-based DT framework for automated
MPS management in an SC4.0 context with a zero-defect characteristic, which we call smart
MPS, to provide an answer to the following research questions:
- RQ1: What mechanisms can make the DT competent in assisting the MPS process
from an enabling strategy?
- RQ2: How can ML techniques help to overcome the difficulties that arise from the
MPS problem’s computational efficiency?
- RQ3: How does the ZDM anti-disturbing strategy push MPS to achieve a more
resilient and sustainable SC?
- RQ4: Can the DT technology, the ZDM management model and ML-based modelling
approaches be considered conceptual complementary tools that support MPS and
push it to higher resilience and sustainability levels?
The rest of the paper is organized as follows. Section 2 first provides definitions of the
main concepts included in this research and subsequently offers an overview of the related
literature. Section 3 describes the conceptual proposal by defining an initial framework
and presenting the setup of a smart MPS. Section 4 discusses the main implications of the
proposal formulated by reviewing how the proposal responds to the research questions.
Finally, Section 5 provides the main conclusions and further research.
2. Literature Review
The review of the selected literature was carried out in four stages: (i) a semantic
introduction to the main involved concepts; (ii) a literature search; (iii) a thematic approach
of the joint domain making up the selected literature; (iv) a content analysis to identify the
main contributions from the perspective of this research.
2.1. The Main Involved Concepts

The introductory definitions of the main concepts employed in this research
are provided in Table 1.
Table 1. Definitions of the main concepts.
Concept | Definitions
Industry 4.0 (Enabling context) | I4.0 stands for the fourth industrial revolution, which is defined as a new level of organization and control over the entire value chain of products’ life cycle. It is geared to increasingly individualized customer requirements [28]. A combination of digital technology with manufacturing transforms industrial production to the next level [29]. The convergence of industrial production, information and communication technologies [30].
Supply chain 4.0 (Target context) | A transformational holistic approach to SC management that utilizes I4.0 disruptive technologies to streamline SC processes, activities and relations to generate significant strategic benefits for all the SC stakeholders [31]. SC4.0 is the SC created as a result of the new digital era brought forth by the fourth industrial revolution [32], I4.0. The reorganization of SCs (design and planning, production, distribution, consumption, reverse logistics) using technologies known as I4.0 [1].
Master production schedule (Research object) | A line on the master schedule grid that reflects the anticipated build schedule of those items assigned to it, and one that represents the items that a company plans to produce and are expressed as specific configurations, quantities and dates [8]. The MPS is essential for maintaining customer service levels and stabilizing production planning in a material requirements planning (MRP) environment [33]. The MPS drives the MRP system and provides an important link between the forecasting, order entry, and production planning activities on the one hand, and the detailed planning and scheduling of components and raw materials on the other hand [34].
Digital twin (Research tool) | A dynamic model in the virtual world that is fully consistent with its corresponding physical entity in the real world and can simulate its physical counterpart’s characteristics, behavior, life, and performance in a timely fashion [35]. A virtual model in the virtual space that is used to simulate the behavior and characteristics of the corresponding physical object in real time [36]. A virtual and computerized counterpart of a physical system that can exploit the real-time synchronization of the sensed data from the field and is closely linked with I4.0 [37].
Machine learning (Research tool) | A computer program capable of learning from experience to improve a performance measure of a given task [38]. ML is an evolving branch of computational algorithms, designed to emulate human intelligence by learning from the surrounding environment [39]. ML is an artificial intelligence application that provides computers with the ability to automatically learn and improve from experience with no direct programming [40].
Zero-defect manufacturing (Research tool) | A strategy whose goal is to decrease and mitigate failures in manufacturing processes and to do things right the first time [41]. A manufacturing strategy which, by assuming that errors and failures will always exist, focuses on minimising and detecting them online so that no production output that deviates from specification advances to the next step [16]. ZDM consists of four strategies: detection, repair, prediction, prevention [42].
Intelligence (I4.0 design principle) | The attribute that defines an artificial system’s behavior which, if a human behaves in the same way, is considered intelligent [43]. Intelligence assists decision making by converting raw business data into valuable and meaningful information and knowledge [44], and is supported by the development of advanced analytics and data visualization models, platforms and services that support decision-making processes [45]. Intelligence is a corporate capability to forecast change, regardless of it coming in the form of opportunity or threat, and in time to do something about it [46].
Real-time action ability (I4.0 design principle) | A set of conditions, qualities and abilities that allows a device or system to correctly perform a function when interacting with a real-world physical process that shares the same temporal constraints. In the SC context, this capability characterizes the way in which a given SC device or system successfully performs its function within the time frame that configures the process with which it interacts without altering the pace of its progress. This capability is one of the main concerns in an SC as it speeds up the elicitation of responses during decision making and, consequently, increases its efficiency [47].
Supply chain resilience (Expected effect) | Resilience is an SC’s capacity to persist, adapt or transform when faced with change from both engineering and social-ecological perspectives [48]. An SC’s adaptive capability to prepare for and/or respond to disruptions, to make a timely and cost-effective recovery and to, therefore, progress to a post-disruption state of operations, ideally a better state than that before the disruption [49]. SC resilience is the adaptive capability to prepare for unexpected events, respond to disruptions, and recover from them by maintaining the continuity of operations at the desired level of connectedness and control over both structure and function [50].
Supply chain sustainability (Expected effect) | SC sustainability is the management of environmental, social and economic impacts, and the encouragement of good governance practices, throughout the life cycles of goods and services [51]. The extent to which the SC organization’s decisions impact the future situation of the natural environment, society and business viability [52]. A sustainable SC is one that includes measures of profit and loss, as well as social and environmental dimensions. Such conceptualization has been referred to as the sustainability triple dimension: financial, social, environmental [53].
2.2. Literature Search

The SC is a conceptual realm that has been approached from many different angles
with more than 50,000 entries in Scopus in the past decade alone. In an attempt to identify
all those trends, Maryniak et al. [54] diagnose the dominant SC topic areas of the
last three decades. However, hardly any literature has been identified that simultaneously
addresses the MPS problem in the SC from the ZDM perspective and with the joint support
of DT and ML technologies. Thus, in the Scopus database, the search string TITLE-ABS-KEY
((“supply chain” OR “supply network”) AND (“master production” OR mps) AND
(zdm OR “zero * defect”) AND “digital twin” AND (“machine learning” OR “artificial
intelligence”)) returned only one result, which evidences a knowledge gap. For this reason,
we further explored the existing literature in the individual knowledge domains MPS,
ZDM, DT and ML, applied specifically to the SC, which added 24 relevant papers to the
aforementioned one (Table 2).
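For reproducibility, the structure of the search string can be made explicit: each knowledge domain contributes a group of alternative terms joined with OR, and the groups are joined with AND inside the TITLE-ABS-KEY field. The short sketch below assembles the string from such groups. The helper names (`or_group`, `scopus_query`) are hypothetical, straight quotes replace the typographic ones printed above, and the code only builds the text of the query; it does not contact Scopus.

```python
def or_group(terms):
    """Join alternative terms with OR; a single term is returned without parentheses."""
    if len(terms) == 1:
        return terms[0]
    return "(" + " OR ".join(terms) + ")"


def scopus_query(groups, field="TITLE-ABS-KEY"):
    """AND together one OR-group per knowledge domain inside a field code."""
    return f"{field} ({' AND '.join(or_group(g) for g in groups)})"


# One group per domain: SC, MPS, ZDM, DT and ML (terms taken from the text).
GROUPS = [
    ['"supply chain"', '"supply network"'],
    ['"master production"', 'mps'],
    ['zdm', '"zero * defect"'],
    ['"digital twin"'],
    ['"machine learning"', '"artificial intelligence"'],
]

query = scopus_query(GROUPS)
print(query)
```

Dropping one group from `GROUPS` yields the relaxed, per-domain queries that would be needed when the joint query returns too few results.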
Table 2. Relevant literature about MPS, DT, ML, and ZDM applied specifically to the SC.
No. | Author | Title
1 | Chern et al., 2014 [55] | Solving a Multi-Objective Master Planning Problem with Substitution and a Recycling Process for a Capacitated Multi-Commodity Supply Chain Network
2 | Grillo et al., 2015 [56] | Application of Particle Swarm Optimisation with Backward Calculation to Solve a Fuzzy Multi-Objective Supply Chain Master Planning Model
3 | Sutthibutr and Chiadamrong, 2019 [57] | Applied Fuzzy Multi-Objective with α-Cut Analysis for Optimizing Supply Chain Master Planning Problem
4 | Arani and Torabi, 2018 [58] | Integrated Material-Financial Supply Chain Master Planning under Mixed Uncertainty
5 | Ghasemy et al., 2020 [59] | Robust Master Planning of a Socially Responsible Supply Chain under Fuzzy-Stochastic Uncertainty (A Case Study of Clothing Industry)
6 | Martin et al., 2020 [60] | Master Production Schedule Using Robust Optimization Approaches in an Automobile Second-Tier Supplier
7 | Peidro et al., 2012 [61] | Fuzzy Multi-Objective Optimisation for Master Planning in a Ceramic Supply Chain
8 | Serrano et al., 2021b [18] | Digital Twin for Supply Chain Master Planning in Zero-Defect Manufacturing
9 | Orozco-Romero et al., 2020 [62] | The Use of Agent-Based Models Boosted by Digital Twins in the Supply Chain: A Literature Review
10 | Marmolejo-Saucedo et al., 2020 [12] | Digital Twins in Supply Chain Management: A Brief Literature Review
11 | Barykin et al., 2020 [63] | Concept for a Supply Chain Digital Twin
12 | Ivanov et al., 2019 [13] | Digital Supply Chain Twins: Managing the Ripple Effect, Resilience, and Disruption Risks by Data-Driven Optimization, Simulation, and Visibility
13 | Ivanov and Das, 2020 [64] | Coronavirus (COVID-19/SARS-CoV-2) and Supply Chain Resilience: A Research Note
14 | Dolgui et al., 2020 [65] | Reconfigurable Supply Chain: The X-Network
15 | Park et al., 2021 [66] | The Architectural Framework of a Cyber Physical Logistics System for Digital-Twin-Based Supply Chain Control
16 | Wang et al., 2020 [10] | Digital Twin-Driven Supply Chain Planning
17 | Alves and Mateus, 2020 [67] | Deep Reinforcement Learning and Optimization Approach for Multi-Echelon Supply Chain with Uncertain Demands
18 | Peng et al., 2019 [68] | Deep Reinforcement Learning Approach for Capacitated Supply Chain Optimization under Demand Uncertainty
19 | Boute et al., 2021 [69] | Deep reinforcement learning for inventory control: A road map.
20 | Afridi et al., 2020 [70] | A Deep Reinforcement Learning Approach for Optimal Replenishment Policy in A Vendor Managed Inventory Setting For Semiconductors
21 | Kegenbekov and Jackson, 2021 [71] | Adaptive supply chain: Demand–supply synchronization using deep reinforcement learning
22 | Siddh et al., 2014 [72] | Integrating Lean Six Sigma and Supply Chain Approach for Quality and Business Performance
23 | Pardamean and Wibisono, 2019 [73] | A framework for the Impact of Lean Six Sigma on Supply Chain Performance in Manufacturing Companies
24 | Poornachandrika and Venkatasudhakar, 2020 [74] | Quality Transformation to Improve Customer Satisfaction: Using Product, Process, System and Behavior Model
25 | Thakur and Mangla, 2019 [75] | Change Management for Sustainability: Evaluating the Role of Human, Operational and Technological Factors in Leading Indian Firms in Home Appliances Sector
Given the special relevance of some of the involved concepts, such as: (i) I4.0, for repre-
senting the enabling context; (ii) SC4.0, for representing the target context; (iii) intelligence
and real-time action ability, for constituting the main I4.0 design principles implied in the
proposed research line; (iv) resilience and sustainability, for being the ultimate expected
effects of applying the DT-ML-ZDM scheme in the MPS, the review of these 25 papers has
also considered the treatment given to all these concepts.
2.3. Thematic Analysis

The thematic analysis of the selected papers, carried out with the VOSviewer 1.6.16
tool, shows (Figure 1) a first grouping of concepts around the main one, “supply chain”,
which is connected to four other groupings: “digital twin”, “production plan”, “digital
technology” and “supply chain management”. From the thematic map, and based on the
co-occurrences in the text composed of the title and the abstract of each paper, we observe
that: (i) the main group formed by “supply chain”, in addition to “organization”, is formed
by “ZDM”, “sustainability” and “uncertainty”; (ii) the most closely related concept to
“supply chain” is “digital twin”, which might reveal the importance that this technology
has acquired in the SC field; (iii) “digital twin” and “supply chain planning” form a cluster,
which shows the importance that the DT has in academia for SC planning processes; (iv) the
“production plan” cluster is also formed by “CPS” (cyber-physical systems) and “agent”,
which can place them as common tools for researchers in production planning; (v) the
cluster headed by “digital technology” includes concepts such as “quality”, “simulation”,
“ripple effect” and “resilience”, and this relation can show where digital technology draws
useful attention or generates interest in the SC domain; (vi) the cluster headed by “supply
chain management” also integrates “knowledge” and “future research”, which might be
related to the shown interest in acquiring new knowledge into the SC domain that supports
improvements to its management processes.
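The co-occurrence counting that underlies such a thematic map can be illustrated with a minimal sketch. It assumes a fixed list of terms and plain substring matching on lower-cased text, which is far simpler than the term extraction that VOSviewer performs, so it should be read as an approximation of the idea and not as a description of the tool. The function name `cooccurrence` is hypothetical, and the three input strings are titles taken from Table 2, used here only as toy input.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(documents, terms):
    """Count, for every pair of terms, the documents in which both appear."""
    counts = Counter()
    for text in documents:
        lowered = text.lower()
        # Sorting gives each pair a canonical order, e.g. (a, b) and never (b, a).
        present = sorted(t for t in terms if t in lowered)
        counts.update(combinations(present, 2))
    return counts


# Toy corpus: in the paper the text is the title plus the abstract of each work.
docs = [
    "Digital Twin for Supply Chain Master Planning in Zero-Defect Manufacturing",
    "Digital Twins in Supply Chain Management: A Brief Literature Review",
    "Concept for a Supply Chain Digital Twin",
]
terms = ["supply chain", "digital twin", "resilience"]

links = cooccurrence(docs, terms)
for pair, weight in links.most_common():
    print(pair, weight)
```

The pair weights play the role of link strengths in the map: terms that co-occur often are drawn close together and tend to fall into the same cluster.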
2.4. Content Analysis

Proposing new resolution and optimization models has been a recurrent approach in
the specific SC literature segment that focuses on the MPS. Chern et al. [55] put forward a
multi-objective MPS resolution model with a heuristic method based on a genetic algorithm
called the GA-based master planning algorithm (GAMPA) to solve an MPS problem with
multiple final products, substitutions and a recycling process with a stochastic pattern,
which creates a loop in both the SC and product structure trees. Grillo et al. [56] use
the fuzzy set theory to model uncertainty and propose a metaheuristic particle swarm
optimization (PSO) technique as a solution method. Sutthibutr and Chiadamrong [57] propose a
method to achieve an optimal MPS in an uncertain environment. It is based on a multi-objective
linear fuzzy model with an α-cut analysis, which ensures that the result satisfies decision
makers’ preferences based on a specified minimum allowed satisfaction
value (α). Arani and Torabi [58] integrate physico-material tactical plans with financial ones
to account for their reciprocal effects in a bi-objective mixed possibilistic-stochastic model
for an SC master planning problem. Ghasemy et al. [59] propose a mixed integer nonlinear
programming model with probabilistic constraints to determine centralized planning,
viewed from the sustainability perspective under uncertainty. Here the sustainability
aspect is reduced to a sustainable procurement planning addressed by appropriate supplier
selection. Martin et al. [60] address the uncertain MPS problem for an automotive second-tier
supplier with two optimization approaches based on other authors’ research. Both
were tested in a real automotive SC and compared to a deterministic approach. The MPS
problem for a centralized SC of replenishment, production and distribution is tackled by
Peidro et al. [61], who present a fuzzy multi-objective linear programming approach to
model it.
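Several of the models above rely on fuzzy numbers and α-cuts. As a generic illustration, and not the specific model of ref. [57], the sketch below computes the α-cut of a triangular fuzzy number (a, b, c), that is, the interval of values whose membership degree is at least α: [a + α(b − a), c − α(c − b)]. The function name `alpha_cut` and the demand figures are hypothetical.

```python
def alpha_cut(a, b, c, alpha):
    """Interval of a triangular fuzzy number (a, b, c) with membership >= alpha."""
    if not a <= b <= c:
        raise ValueError("expected a <= b <= c")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    lower = a + alpha * (b - a)
    upper = c - alpha * (c - b)
    return lower, upper


# Hypothetical fuzzy demand: at least 80, most plausibly 100, at most 130 units.
demand = (80.0, 100.0, 130.0)

for level in (0.0, 0.5, 1.0):
    print(level, alpha_cut(*demand, alpha=level))
```

Raising α narrows the interval toward the most plausible value, which is how a minimum allowed satisfaction value restricts the range of parameter values that a plan has to cover.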
Figure 1. Thematic map.
Serrano et al. [18] propose an initial DT-based conceptual framework to model and
simulate the MPS problem with a ZDM feature in the SC4.0 context. This is the only paper
in the literature to address the focus of this research comprehensively, albeit with an initial
descriptive approach. This framework focuses on creating an enabling space for solving
optimization algorithms for the MPS problem based on applying deep reinforcement learning
(DRL) techniques. The framework is designed to accommodate the set of actors in the
SC, along with their physical and virtual processes and resources in a collaborative manner.
Its design aims to improve SC performance by reinforcing the digitization, intelligence,
visibility, interconnectedness, organization and sustainability I4.0 attributes. This initial
framework is restricted to the manufacturer and goes up to second-tier suppliers to narrow
down the problem’s scope.
According to Orozco-Romero et al. [62], the DT technology is a tool that enables both
real-time digital monitoring and automatic decision making. Therefore, DTs are relevant
tools when pursuing the goal of automating SC systems. Marmolejo-Saucedo et al. [12]
review the scientific literature on DTs as one of the main I4.0 enabling technologies within
the SC management realm. The association of DTs with SC visibility, and the possibility
of planning and making real-time decisions, lead to better disruptive risk management
and higher resilience levels. Along such lines, Barykin et al. [63] attribute the need to
build DTs given SCs’ poor reliability and stability due to errors in their operation. They
assert that DTs can generate information on the impact of such errors, and can influence
SC performance by observing different scenarios that simulate the location of errors and
their duration, and to analyse recovery policies. All this leads to greater SC resilience.
Ivanov et al. [13] explain the SC DT concept and propose a framework for risk management
by analyzing perspectives and future transformations that can help to integrate resilience
owing to the information provided by the DT. According to the authors’ paper, an SC
DT is a model that can represent the network state for any given moment in time, and
allows for complete end-to-end SC visibility to improve resilience and to test contingency
plans, which is clearly aligned with the approach of this research by focusing on resilience
and sustainability. The research by Ivanov and Das [64] is centred on SC resilience after
disruptive events occurring as a result of the COVID-19 pandemic and how to optimally
recover normalcy in an SC. It identifies the need to implement such a partnership to map
supply networks and to ensure their visibility as a tool to recover from disruption, where
the DT can play a significant role by taking the disruptive effect of the pandemic as an
example. Dolgui et al. [65] propose reconfigurability as an SC parameter that characterizes
the SC in an uncertain and changing environment. They do so by addressing the notion
of a reconfigurable SC, or an X-network, by taking the DT as a basis for its design. In a
reconfigurable SC, the organization design at the network level must be shaped by I4.0,
circular economy, industrial symbiosis and collaborative industry. In SCs, reconfigurability
plays an important role in I4.0 design principles, such as intelligence, real-time action
capability, flexibility and sustainability (the last of which comes in its three well-known
dimensions), as well as enabling technologies such as the DT, in this specific case as SC
DTs which, according to the authors, are computerized models representing the network
state for any given moment in real time. SCs’ resilience to fluctuations in make-to-order
SC environments in customized production cases is addressed by Park et al. [66], who
propose a logistics CPS, or CPLS, coordinated with agent cyber physical production systems
(CPPS) in a multi-level cyber-physical system structure based on distributed DT simulation
technology. Wang et al. [10] address the SC problem from a DT perspective by detailing its
benefits and potential compared to other approaches: (i) with synchronization between the
physical and virtual twin, the DT promotes faster action and response to reduce lead times;
(ii) with dynamic and comprehensive data collection, the DT improves forecast accuracy;
and (iii) with high-quality modeling, the DT significantly improves planning verifications. Thus,
in the I4.0 and SC4.0 eras, the DT promotes demand forecasting, aggregate planning and
inventory planning to be more analytical, reliable, efficient and quick to obtain, which all
favor SC resilience.
As for using ML techniques to support production planning and control problems
in the SC domain, it is worth noting that most contributions focus on the operational
decision level. Of those dealing with planning at the tactical decision level, most focus on
either inventory replenishment or, to a lesser extent, dynamic supplier selection problems.
Alves and Mateus [67] consider a DRL approach based on an improved version of the
proximal policy optimization algorithm (PPO), called PPO2, to solve the inventory problem
of a four-step SC with two nodes per step and stochastic demands. The optimization
approach for Peng et al. [68] is similar, but the modeled problem considers a simpler SC
composed of three stages—plant, plant warehouse and retailer—subject to independent,
stochastic and seasonal demand. In it the adequate and stable supply of raw materials
is assumed, but the plant’s production capacity is limited. The article of Boute et al. [69]
offers a conceptual approach, and its objective is to describe the key design choices of
DRL algorithms to facilitate their implementation into the inventory control task in SCs.
It first introduces MDPs for inventory control optimization in their different solution
approaches. Second, it describes the use of neural networks to solve MDPs, as well as the
different methods that arise according to how the function of Bellman equations is used for
the neural network design. After these theoretical introductions, the authors explain the
procedure followed to develop DRL algorithms by providing a taxonomic analysis. The
research by Afridi et al. [70] focuses on the environments of certain complex SCs, such as
those in the semiconductor industry, where innovation cycles are short, production lead
times are long and demand uncertainty is high. These operating conditions in SCs mean
that semiconductor manufacturers are particularly exposed to the undesired amplification
of demand fluctuations within the chain, a phenomenon known as the bullwhip effect,
which was described by Lee et al. [76]. In this context, the authors propose adopting
a collaborative strategy known as vendor-managed inventory (VMI), in which the
supplier takes control and full responsibility for replenishing the customer’s inventory
by defining minimum and maximum inventory levels, all supported by the deep
Q-network (DQN) method. The authors consider a two-stage SC and model this problem
as an MDP. Synchronization of SCs as a means to avoid the bullwhip effect in stochastic
environments constitutes the central theme of the research by Kegenbekov and Jackson [71].
Indeed, an SC with synchronized stages and nodes can prevent the dynamics of cascading
inventory increases and decreases that follow unanticipated fluctuations in demand, and
mitigate the bullwhip effect caused by operational errors. A DRL agent can perform
the adaptive coordination needed for such synchronization, as long as end-to-end
visibility in the SC is complete. The authors model as an MDP a problem characterized
by a single-product, multi-stage, single-node-per-step SC environment in which a
PPO agent has to choose how many products to order from all the SC agents in each step
and thus indirectly obtain local inventory levels.
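As a rough numerical illustration of the bullwhip effect discussed above, the sketch below computes a commonly used indicator: the ratio between the variance of the orders placed upstream and the variance of the observed demand. The demand series and the over-reacting ordering rule are hypothetical and are not taken from any of the reviewed works; a ratio above 1 simply indicates that demand fluctuations are amplified.

```python
from statistics import pvariance


def bullwhip_ratio(orders, demand):
    """Variance of orders placed upstream divided by variance of observed demand."""
    return pvariance(orders) / pvariance(demand)


# Hypothetical demand observed by a retailer over five periods
demand = [100, 110, 90, 105, 95]

# Hypothetical over-reacting rule: order the demand plus twice its last change
orders = [demand[0]] + [d + 2 * (d - prev) for prev, d in zip(demand, demand[1:])]

ratio = bullwhip_ratio(orders, demand)
print(f"bullwhip ratio = {ratio:.2f}")  # values above 1 mean amplification
```

A synchronized SC of the kind described above would aim to keep this ratio close to 1 at every stage.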
The application of the ZDM philosophy to the SC domain has also been a topic
addressed by researchers, albeit sparsely. Most focus more on the quality management
discipline than on production planning and control, and the zero-defect outcome comes
about from indirectly applying other strategies or philosophies. For Siddh et al. [72], the
objective is to integrate lean six sigma into SCs instead of ZDM, but the zero-defect outcome
is indirectly achieved as an effect. Within the lean six sigma framework, the authors place
a central idea: once the number of defects in a process is known, it is possible to work out
systematically how to eliminate them. This research does not address resilient SC properties
and, as the authors state, the only mention made to the sustainability issue is through
the 5S of lean manufacturing: sort, store, shine, standardize, sustain. Pardamean and
Wibisono [73] propose a framework to explain the impact of six sigma on SC performance
based on increasing process capability in the value stream by seeking zero defects and
reducing process variation, which approximates to the aforementioned exogenous strategy
to mitigate the milieu stochasticity, also described as an antidisturbing strategy, and thus
favors the automation of SC production systems and processes, and the capability to respond
in real time to changes in the environment. In this research, sustainable SCs’ logistics
performance is assessed using three categories, namely sustainable supplier selection,
sustainable production and sustainable delivery. Poornachandrika and Venkatasudhakar [74]
present a behavioral process and a system model for achieving zero defects with a case
study conducted in an automotive company. This article focuses mainly on the transformation
of quality within SCs. One of its main conclusions is that the elimination of human
intervention in some processes improves results, which relates it to automation. Unlike the
above authors, Thakur and Mangla [75] understand the zero-defects concept in the SC as
one of the final effects of sustainable practices.
Finally, it is worth noting that the relevant literature on MPS, DT, ML and ZDM
applied specifically to the SC shows a common thread that should be highlighted here.
Most articles present research results that, in one way or another, and to a greater or lesser
extent, are based on some of the design principles and enabling technologies of I4.0 and,
therefore, of SC4.0, from a positive perspective of both paradigms; in other words, from a
position that assumes, as a valid axiom, that introducing I4.0 and SC4.0 into a context such
as that of SCs only leads to positive effects. Some researchers argue that this is not really
the case. Adopting I4.0 and SC4.0, in addition to opportunities, involves barriers and poses
risks [77] that must be duly considered when addressing any digital transformation project
in the SC, and which will depend largely on the selected digitization strategy and the core
capabilities acquired by the SC by that strategy [78].
From the review, it can be concluded: (i) the existing literature on the MPS problem
addressing the DT, ML and ZDM individually is abundant and varied, but the literature that
addresses the problem from a joint perspective is practically nonexistent; (ii) DT technology
is considered by researchers an enabling tool to achieve higher efficiency and reliability
levels by endowing SC systems with capabilities, such as decision-making automation,
real-time response, end-to-end visibility or disruptive risk management; (iii) conceptual
framework or model proposals based on the DRL-driven DT are very limited; (iv) using
ML methods to support production planning in the SC domain is also a limited practice
that centers mostly on DRL-based methods; (v) of all the DRL-based methods followed
by the researchers in the SC planning domain, PPO implementations have become more
prominent in the last 3 years, followed by DQN algorithms, whose use is currently declining
in favor of PPO and its variants, as previously indicated; (vi) the ZDM issue in the SC
domain is still not approached as a per se strategy, but appears as an effect of applying other
strategies, such as lean manufacturing, six sigma, or their merger lean six sigma, despite the
remarkable and growing interest shown by researchers in applying ZDM to other planning
domains, especially at the operational decision level. Only a couple of authors mention the
potential of this strategy for the mitigation of disturbances that affect processes, which is
so exploited in other planning contexts, especially in operational decision terms such as job
scheduling and sequencing.
3. Proposal
The proposal of a conceptual framework for the smart MPS based on the DT-ML-ZDM
scheme is formulated in the following five stages in this section: (i) alignment axes of the
proposal with the I4.0 and SC4.0 paradigms; (ii) integrating the DT for the MPS into the SC
context; (iii) integrating the physical and virtual environments of the DRL-based DT; (iv)
description of the DRL-based agent’s learning and prescription processes; and, finally, (v)
the proposal summary.
3.1. Alignment Axes of the Proposal with I4.0 and SC4.0

The proposal presented here is based on the general assumption that the environment
on which it is developed has the characteristics of an SC4.0, i.e., an SC whose digital
transformation is aligned with the design principles governing I4.0 and is carried out by
using its enabling technologies. In this particular case, some design principles
of I4.0, such as flexibility, intelligence, integration, virtualization, interconnectedness,
interoperability, visibility, real-time action ability, energy efficiency and sustainability [79–82],
play a relevant role, directly or indirectly, in the endeavor to confront SC complexity and
heterogeneity toward more resilience and sustainability on the way toward SC4.0. The
same applies for some of its enabling technologies, such as information and communication
technologies (ICT), cyber-physical systems (CPS) or cyber-physical production systems
(CPPS), the Internet of things (IoT) or the industrial IoT (IIoT), smart enterprise resource
planning (ERP), manufacturing execution system (MES), virtual reality, DT, ML algorithms,
big data, cloud services or cloud manufacturing, semantic technologies and cybersecurity
[79,81,83–85], which are involved in the design of the proposal, along with techniques
such as modeling, simulation and optimization.
3.2. Integrating the DT into the SC Context

Within the conceptual framework that is herein proposed, the DT is firstly characterized
by virtually replicating the MPS, an operation also known as digital twinning [86].
Based partially on the research by [13,63,66], the proposed DT shapes the MPS as
two different planes, the physical plane and the virtual one, as shown in Figure 2. In the
physical DT plane, the MPS is determined by physical processes and resources, meaning
data and information on the processes and resources from the actual SC environment. The
main physical processes that determine the MPS are: (i) demand forecasting; (ii) receiving
customer orders; (iii) planning processing; (iv) formalizing the intervening parties’ commitment
to the MPS; (v) referring to suppliers about the MPS; (vi) controlling MPS evolution.
As for the involved physical resources, the MPS is determined by: (i) manpower; (ii) productive
equipment; (iii) inventory; (iv) started production; (v) subcontracted quantities; (vi)
capacity constraints; and (vii) time as a resource represented by the different milestones
shaping and constraining the problem. The sources and communication systems of these
data and information can vastly vary by taking into account the environment in question,
characterized by the I4.0 and SC4.0 paradigms: CPS/CPPS, sensorization, the IoT/IIoT,
cloud manufacturing, smart ERP, MES, among other I4.0 enabling technologies. The data
and information fed to the DT from any SC node must be automated and its real-time flow
must be guaranteed.
Figure 2. Integrating the DT into SC context. Figure based on Serrano et al. [87].
In order to perform the analysis, simulation, optimization and prescription, the data
and information from the physical SC environment must be replicated and processed
virtually at two different levels: the backend, or support level, and the frontend, or interface.
The backend forms part of the DT development and is responsible for running the existing
system logic behind the interface with the human operator. In the backend, the processes
and resources data and information from the physical plane are translated into virtual
processes and resources. The virtual processes that enable DT functioning in the backend
are: (i) simulation in the virtual environment for agent training, based on historical data
or the generation of synthetic scenarios; (ii) agent training in the virtual environment, a
parallel and simultaneous process to the previous one; and (iii) agent prediction, herein
called prescription, a process enabled by the successful completion of the training process.
By virtual backend resources, we mean both the data and information related to the real
plane elements, as well as those related to the formulation and modeling of the MPS
problem, but they are all coded and combined in such a way that they can feed the above
simulation, training and prescription processes based on the DRL method. This includes
data and information from: (i) the MDP model of the MPS and the DRL model; (ii) demand;
(iii) costs; (iv) lotification; (v) capacity; (vi) deadlines and periods; and (vii) possible
policies. From these backend processes, data and information, the frontend, as an interface
specially prepared for human users, automatically provides in real time the schedule that
is currently prescribed by the agent and the necessary information about using resources.
This information can also automatically feed other tactical decision level processes, such as
MRP, inventory control or capacity requirements planning (CRP).
The DT backend, and the MPS data and information contained therein, are elements
that, in principle, belong to the manufacturer’s sphere and are not replicated for other
SC stakeholders, i.e., suppliers, warehousers, retailers or, in some cases of customized
manufacturing, even customers. Unlike the backend, the DT frontend, whose replication
scope extends beyond the manufacturer’s sphere (the centre of the SC within this
framework), is shared with other SC stakeholders in a collaborative cloud-computing
environment to provide end-to-end visibility to each SC actor and the possibility of real-time
process synchronization to achieve: (i) greater SC enablement against unexpected demand
fluctuations, which makes it more resilient; and (ii) optimized use of resources by enabling
inventory reduction, improved transportation efficiency, reduced energy use and a shorter lead
time and, thus, lower costs, among other effects that result in greater sustainability.
Within this framework, the SC is understood as a single domain for all the intervening
SC stakeholders, where each one uses personalized data and information blocks about
the MPS with different access categories according to their particular needs, but all from a
single common origin: the DT. This scheme not only facilitates the flow of data and information
about production planning among actors, but also creates a coordination channel
for the zero-defect strategy in the SC as it makes it possible to: (i) enable collaborative
manufacturing with the DT as a means of sharing data and information about processes
and resources; (ii) for each involved stakeholder, monitor the MPS process parameters
that need to be shared in this collaborative manufacturing context to improve early defect
detection, or even prediction, as a way to empower prevention policies and to, thus, better
cope with disturbing or disruptive events and their subsequent recovery; (iii) enhance
data storage, analysis and visualization by unifying these performances through the DT;
(iv) quickly reconfigure and reorganize the MPS whenever necessary in a coordinated
manner, thereby gaining efficiency and reducing idle times; and (v) collaboratively
launch real-time production rescheduling across the entire SC, which is generated and
spread by the DT. In a nutshell: (i) collaborative manufacturing; (ii) process monitoring;
(iii) data management enhancement; (iv) reconfiguration and reorganization; and
(v) real-time rescheduling ability, i.e., five of the seven system areas—which also include
continuous quality control and online predictive maintenance—formulated by Lindström
et al. [16] in their model for ZDM would be collected and considered within this framework
to favor a zero-defect goal in the SC and, in this specific manner, to understand MPS
processes and fight against process failures to minimize, mitigate and eliminate possible
disturbances that can potentially place the SC’s normal operation at risk and lead to higher
resilience levels.
The implementation of the DT for the SC smart MPS according to the described
framework would require several stages to be extended throughout the chain as a whole.
Nevertheless, the first and most important one is to develop the manufacturer’s specific
domain, where the backend is located as the core of the DT, before extending it to the
scope of the other actors involved in the SC. The basic infrastructure and processes in this
restricted DT space are described in the two following subsections of the proposal.
3.3. Integrating the Physical and Virtual Environments of the DRL-Based DT

The DRL-based DT is configured within this framework as a set of overlapping
and interrelated layers, where each individual layer demarcates a defined part of the DT
environment (Figure 3). This setup is partially based on the research by Serrano et al. [88]
for a smart DT for ZDM-based job-shop scheduling.
All these DT layers or elements act as a receiver, processor and/or generator of data
and information, depending on the characteristics of the role played in the DT. The physical
environment of the DT groups the following five elements: (i) the hardware and software
making up the DT frontend interface and backend processing core; (ii) the hardware and
software for storing the dataset in the cloud; (iii) the IIoT; (iv) cyber-physical systems (CPS)
distributed throughout the SC physical environment; and (v) information captured locally
on the current state of production and resources that is relevant to the MPS. Regarding the
virtual environment, it groups the following elements: (i) demand forecasts and the current
status of customer ordering, dynamically updated in real time; (ii) the DRL agent; (iii)
the master scheduling policy; (iv) the simulation environment for agent training; (v) the
accumulated training data; and (vi) the set of actions taken by the agent on the MPS.
Figure 3. Integration of the physical and virtual environments of the DRL-based DT.
All these elements are synchronized and constitute a single cohesive environment in
the DT.
3.4. Description of the DRL-Based Agent’s Learning and Prescription Processes

Both processes are based on the DRL method [69], and are basically developed by
two elements, the training environment and the DRL agent (Figure 4), to be implemented
into a DRL framework based on the Python code with the help of its specialized open
source libraries.
The training environment is the MPS modeled as an MDP in such a way that it is
made up of: (i) an observation space; (ii) an action space; (iii) an initial state; (iv) the state
transition function. The observation space specifies the variables of the MPS
problem and delimits the bounds between which they may vary in each period. The action
space determines the variety of actions that can be decided about the MPS problem and to
what extent. The initial state represents the MPS state during the first period considered
in the MPS training cycle, and is defined by the value taken by the MPS variables in the
observation space during the initial period. Finally, the state transition function defines
what varies, and to what extent, between one state and the next after an agent action is
applied in the valid action space. This environment, to be implemented in Python
with the OpenAI Gym library, is assisted by an ad hoc scenario generator, which can
create synthetic problem instances that are adequately modeled to facilitate agent training.
The training process can also be assisted with stored historical MPS data modeled as MDP
requirements if data are available and if deemed necessary or convenient.
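The four components of the training environment described above can be illustrated with a minimal sketch. It is only an assumption-laden toy: a single product, net inventory as the only state variable besides demand, and a cost made of holding and backlog terms. The class `MPSEnv`, its parameters and the cost coefficients are hypothetical, and the class merely imitates the `reset`/`step` convention of Gym-style environments in plain Python rather than using the library itself.

```python
import random
from dataclasses import dataclass, field


@dataclass
class MPSEnv:
    """Toy single-product MPS environment with a Gym-like reset/step interface."""
    horizon: int = 8              # number of planning periods in one cycle
    capacity: int = 100           # upper bound of the action space (units per period)
    initial_inventory: int = 20   # inventory at the start of the cycle
    holding_cost: float = 1.0     # cost per unit held at the end of a period
    backlog_cost: float = 5.0     # cost per unit of unmet demand
    seed: int = 0                 # seed of the synthetic scenario generator
    demand: list = field(default_factory=list)

    def reset(self):
        """Initial state: generate a synthetic demand scenario, return the first observation."""
        rng = random.Random(self.seed)
        self.demand = [rng.randint(30, 90) for _ in range(self.horizon)]
        self.period = 0
        self.inventory = self.initial_inventory
        return self._observe()

    def _observe(self):
        # Observation: (period index, net inventory, demand of the current period)
        demand_now = self.demand[self.period] if self.period < self.horizon else 0
        return (self.period, self.inventory, demand_now)

    def step(self, quantity):
        """State transition: produce `quantity`, serve demand, move to the next period."""
        if not 0 <= quantity <= self.capacity:
            raise ValueError("action outside the valid action space")
        self.inventory += quantity - self.demand[self.period]
        cost = (self.holding_cost * max(self.inventory, 0)
                + self.backlog_cost * max(-self.inventory, 0))
        self.period += 1
        done = self.period >= self.horizon
        return self._observe(), -cost, done   # reward is the negative cost


env = MPSEnv()
obs = env.reset()
total, done = 0.0, False
while not done:
    # Naive lot-for-lot policy: cover current demand net of inventory, capped by capacity
    _, inventory, demand_now = obs
    action = min(env.capacity, max(0, demand_now - inventory))
    obs, reward, done = env.step(action)
    total += reward
print("total reward of the lot-for-lot policy:", total)
```

In an actual implementation, the observation and action spaces would be declared with the space types of the chosen library, and the scenario generator would be richer than the uniform demand draw assumed here.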
Figure 4. Setup of the DRL-based DT-driven MPS.
In the training stage, the DRL agent must play its role in the arena shaped by the
above-described environment. From the initial state prepared by the scenario generator,
the agent essentially acts in the environment by triggering an advance toward a new state
for the next period. The environment grants the agent a reward for this step, whose value
essentially depends on how much the new state improves the MPS. With this reward and
the new MPS state, the agent performs a new action that depends on the selected type of
DRL method, i.e., value-based, policy-based, hybrid methods such as actor-critics, among
others, which lead to a new state and a new reward, and so on, period after period, to
complete a planning cycle. These training cycles are repeatedly performed as often as
necessary until the agent’s throughput evaluation exceeds a certain threshold, or the DRL
algorithm is changed by not exceeding the threshold after a predetermined number of
cycles. Finally, when training is evaluated as satisfactory and the DRL agent is considered
sufficiently trained, the latter is prepared to interact with the real environment (which,
unlike the training environment, is dynamic and continuous) and, from this, new MPS
states are prescribed.
The DRL agent can be a Python algorithm to be implemented with the RLLib via Ray
library and Tensorflow, specifically designed to interact in the above-described training
environment with basically two operation modes: training and prescription. When training,
the DRL agent collects the current MPS state and predicts a new state for the next period,
and so on, until all the periods of a complete training MPS cycle have been completed. The
agent’s predictions are based on a learned methodology from synthetic or real data, which
depend on the DRL methodology selected from those existing in the RLLib library and the
adjustment of its hyperparameters. This library, which includes the most basic versions,
those of the policy-based or value-based type, mainly collects the most usual hybrid-
based DRL methodologies, such as actor-critic or gradient-based methods; e.g., policy
gradients (PG), soft actor critics (SAC), advantage actor-critic (A2C, A3C), or proximal
policy optimization (PPO), and some high-performance architectures such as asynchronous
proximal policy optimization (APPO). DRL algorithm selection relies on an additional
module attached to the agent that evaluates its performance during training and has the
capacity to modify: (i) the agent’s number of training cycles, also called epochs; (ii) the
DRL algorithm type depending on the result of evaluations; and (iii) the adjustment of
certain basic hyperparameters that varies according to the selected DRL algorithm.
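The supervisory logic of this additional module can be sketched as follows, under strong simplifying assumptions. The two candidate "algorithms" are hypothetical stand-ins that return an evaluation score as a function of the number of completed training cycles; in a real setup, each would wrap one training iteration of the chosen DRL library. The names `select_and_train`, `algo_a` and `algo_b`, the threshold and the cycle limit are illustrative only, and hyperparameter adjustment is left out.

```python
from typing import Callable, Dict, Optional, Tuple


# Hypothetical stand-ins for DRL algorithms: cycles completed -> evaluation score
def slow_learner(cycles: int) -> float:
    return min(0.6, 0.05 * cycles)   # never reaches a demanding threshold


def fast_learner(cycles: int) -> float:
    return min(1.0, 0.2 * cycles)


def select_and_train(candidates: Dict[str, Callable[[int], float]],
                     threshold: float,
                     max_cycles: int) -> Tuple[Optional[str], int, float]:
    """Try candidates in order; switch when the threshold is not exceeded in max_cycles."""
    score = 0.0
    for name, evaluate in candidates.items():
        for cycle in range(1, max_cycles + 1):
            score = evaluate(cycle)          # one training cycle followed by evaluation
            if score > threshold:
                return name, cycle, score    # agent considered sufficiently trained
        # Threshold not exceeded after max_cycles: move on to the next algorithm
    return None, max_cycles, score           # no candidate exceeded the threshold


result = select_and_train(
    {"algo_a": slow_learner, "algo_b": fast_learner},
    threshold=0.9,
    max_cycles=10,
)
print(result)
```

The ordering of candidates and the stopping rule are design choices of this sketch; the framework itself only states that the module may change the number of cycles, the algorithm type and some hyperparameters.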
3.5. Proposal Summary

In summary: (i) the proposed DT is conceived as a DSS implemented by the manufacturer
and partially shared with suppliers, warehousers, retailers and, depending on the
case, customers by means of a cloud-computing system; (ii) from all these SC stakeholders,
the DT receives the data and information about the processes and resources that are
properly modeled as a DRL instance; (iii) when the DRL agent is trained, the DT processes
the MPS problem automatically and autonomously in real time based on the DRL method;
(iv) the DT provides as output a permanently optimized MPS in the event of any change
in input, but respects the committed ordering policy on the fixed demand horizon, if any;
(v) the DT allows the manufacturer to transmit changes without delays to lower planning
levels, such as MRP, CRP or inventory control; (vi) the DT sends a master supply schedule to
suppliers at their different tiers for their own planning; (vii) the DT sends the available-to-promise
products per period to warehousers, retailers and, depending on the case, customers; and
(viii) the DT delimits the data and information of each actor depending on its role.
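Point (vii) refers to the available-to-promise (ATP) quantities per period. The framework does not specify how they are computed, so the sketch below assumes the textbook discrete ATP rule without look-ahead: in the first period, ATP is the on-hand inventory plus the MPS receipt minus the orders committed before the next receipt; in later periods it is computed only where an MPS receipt arrives. The function name and the sample figures are hypothetical.

```python
def available_to_promise(on_hand, mps, orders):
    """Discrete ATP per period; None where no MPS receipt arrives (except period 0)."""
    n = len(mps)
    atp = [None] * n
    for t in range(n):
        if t > 0 and mps[t] == 0:
            continue  # ATP is only computed in periods with an MPS receipt
        # Index of the next period with an MPS receipt (or end of horizon)
        nxt = next((k for k in range(t + 1, n) if mps[k] > 0), n)
        supply = mps[t] + (on_hand if t == 0 else 0)
        # Subtract the customer orders committed from t up to the next receipt
        atp[t] = supply - sum(orders[t:nxt])
    return atp


atp = available_to_promise(
    on_hand=30,
    mps=[0, 100, 0, 100, 0, 0],      # hypothetical MPS receipts per period
    orders=[20, 40, 30, 10, 5, 0],   # hypothetical committed customer orders
)
print(atp)
```

A negative value in a period would signal overcommitment, which a fuller implementation would typically resolve by drawing on the ATP of earlier periods.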
4. Discussion
The MPS plays a crucial role in the SC and has been a sustained driver of research into
new planning methodologies, which has provided continuous scientific development and
generated new models with a wide range of approaches. However, in today’s dynamic
environment, the growing scale and complexity of global SCs and the new technological
developments occurring at an ever-increasing speed mean that knowledge gaps persistently
appear. In the case at hand, this paper aims to respond to the lack of contributions
detected in the literature on the joint use of the ZDM management model and the ML-based
DT enabling technology to pursue smart master planning and, thus, to contribute to a
resilient and sustainable SC.
As for the mechanisms that make the DT, one of the I4.0 enabling technologies par excellence,
a competent tool for giving the MPS higher levels of automation, autonomy and real-time
action capacity, the DT is a system that combines physical entities (in our case, the data
and information about the real master planning environment) with their virtual counterparts
(the virtual MPS), and exploits the advantages of both the virtual and the physical
environments to benefit the whole system [11]. The DT captures information from the physical
entity, and stores, processes, analyzes and evaluates it so that the knowledge generated can
subsequently be applied not only to current physical entities, but also to future ones [11].
It does so without localization restrictions, given its ability to enable shared virtual
spaces where data and information about systems become more visible [12,13], which in turn
enables collaborative production scenarios. The literature commonly relates the
implementation of this technology to the digital transformation of processes, from the
perspective of both their automation [89] and their endowment with higher autonomy
levels [90]. Moreover, the DT's potential to enable real-time management is a recurrent
research topic in the logistics and industrial field in general [36], and in the area of
production planning and control in particular [37], especially when assisted by artificial
intelligence [7]. Few examples in the literature show the benefits of the DT in the specific
MPS field [18], but examples can be found in many other SC fields, such as real-time
monitoring and control [62], risk management [13], recovery from disruption [65], SCs'
resilience to disruption [66], and planning verifications related to demand forecasting,
aggregate planning and inventory planning [10]. One limitation of this technology is that
the commercial solutions currently on the market have relatively high acquisition and
maintenance costs, and need to be handled by qualified personnel. However, the possibility
of implementing ad hoc solutions with open source tools has increased significantly since
this technology began to make its way in the early part of the last decade.
Regarding ML and its ability to cope with NP-hard computational complexity, it is again the
case that, in the production planning and control area, the academic community has mostly
addressed the application of ML methods to process problems other
than the MPS: at the tactical decision level, mainly inventory control and supplier
selection problems; and at the operational decision level, the various configurations of
the job scheduling problem [18]. It is important to emphasize that, like the MPS, these
problems can be modeled as an MDP, which would a priori allow the reinforcement learning
methodology to be applied to the MPS with guarantees of success similar to those in other
problems. However, it must be assumed that the complex structure of current SCs, especially
global ones with many stages and nodes, the number of variables included in the modeled
problem and its intrinsically stochastic condition make modeling real cases with the
reinforcement learning methodology alone, without the additional assistance of other
methods, a considerable challenge. Only through the gradual incorporation of the DRL
methodology [69] has it been possible to begin to study SCs of a certain complexity. DRL
combines the reinforcement learning methodology with deep learning, another ML methodology
that uses artificial neural networks to transform a set of inputs into a set of outputs and
solves tasks that involve handling complex and high-dimensional raw input data sets [91].
Examples include: (i) the multistage SC problem of Alves and Mateus [67], validated with a
four-stage SC scenario with two nodes per stage, local inventories, lead time, a single
product and demand uncertainty; (ii) the capacitated SC problem of Peng et al. [68],
validated with a three-stage SC scenario with one node in the first stage, two in the second
and three in the last, capacitated production, independent, stochastic and seasonal demand,
and a single product; and (iii) the case of Meisheri et al. [92] who, despite restricting
the validation of their retailers' inventory replenishment to the last SC layers, i.e.,
warehouse and retailer, consider product variety, with instances of 100 and 220 products
that substantially increase combinatorial computation, and incorporate lead time, limited
storage capacity, cross-product restrictions, and weight and volume transportation
restrictions. Computational limitations in this regard are manifested in the size of the
problem that can be solved, in terms of the size of the input dataset and, especially, the
size of the modeled problem's observation space. Nevertheless, advances in the DRL
methodology are continuous, and new implementations with meta-learning, contextual bandits
or high-performance architectures, among others, frequently appear. Their application in
the SC planning field is yet to be explored, as the most advanced implementations in the
related literature do not go beyond policy gradient methods, such as PPO and advantage
actor-critic (A2C), or the basic DQN.
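To make the MDP view of the MPS and the role of the observation space more concrete, the
following is a minimal sketch of a toy environment. It is not the model proposed in this
paper: the class `ToyMPSEnv`, its cost parameters, the uniform demand and the helper
`observation_space_size` are all illustrative assumptions for a single product at a single
node.

```python
import random


class ToyMPSEnv:
    """Single-product, single-node MPS as a finite-horizon MDP (illustrative only)."""

    def __init__(self, horizon=6, capacity=4, max_inventory=10,
                 holding_cost=1.0, stockout_cost=5.0, seed=0):
        self.horizon = horizon              # number of planning periods
        self.capacity = capacity            # maximum lots producible per period
        self.max_inventory = max_inventory  # storage limit
        self.holding_cost = holding_cost
        self.stockout_cost = stockout_cost
        self.rng = random.Random(seed)      # seeded source of demand uncertainty
        self.period = 0
        self.inventory = 0

    def reset(self):
        """Start a new planning episode and return the initial state."""
        self.period, self.inventory = 0, 0
        return (self.period, self.inventory)

    def step(self, action):
        """Produce `action` lots, observe demand, return (state, reward, done)."""
        if not 0 <= action <= self.capacity:
            raise ValueError("action outside production capacity")
        demand = self.rng.randint(0, 4)     # assumed uniform demand in [0, 4]
        available = self.inventory + action
        shortfall = max(demand - available, 0)
        self.inventory = min(max(available - demand, 0), self.max_inventory)
        reward = -(self.holding_cost * self.inventory
                   + self.stockout_cost * shortfall)
        self.period += 1
        done = self.period >= self.horizon
        return (self.period, self.inventory), reward, done


def observation_space_size(n_products, max_inventory, horizon):
    """Number of discrete (period, inventory vector) states for n products."""
    return (horizon + 1) * (max_inventory + 1) ** n_products


# Roll out a fixed policy that always produces two lots per period.
env = ToyMPSEnv(seed=0)
state = env.reset()
total_reward, done = 0.0, False
while not done:
    state, reward, done = env.step(2)
    total_reward += reward

print(observation_space_size(1, 10, 6))  # 77
print(observation_space_size(3, 10, 6))  # 9317
```

Even in this toy setting the number of states grows exponentially with the number of
products, which is why tabular methods stop being practical and function approximation, as
in DRL, becomes necessary.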
The ZDM management model is often associated with I4.0 for presenting largely
compatible objectives and providing synergistic and complementary approaches [17,93,94].
Beyond the most well-known ZDM objectives, such as minimizing failures and defects
and their early online detection, this management model shares with I4.0 the purpose
of minimizing production costs and making production more efficient and sustainable
by reducing the number of failures, breakdowns and defective parts [95]. Although the
ZDM model is not yet fully addressed in the literature about SCs and the MPS, it should
be emphasized that meeting its objectives brings certain benefits for SCs that are of
interest to the present research: (i) minimizing defects on the line, whether they are
failures, breakdowns or defective parts, in turn helps to minimize the disturbances that
usually affect the system [7,15,16,96], which benefits the automation of processes; and
(ii) minimizing and mitigating defects favors SC sustainability in two of its dimensions,
economic and environmental, because achieving the ZDM strategy reduces costs, but also
emissions and the use of energy and raw materials [17,94]. It should also be noted that
ZDM and resilience are related in the literature [97] as they share some significant
points. The path toward higher resilience levels in the SC involves promoting both the
properties that reduce the vulnerability of the SC to disruptive events and those that
reduce its recovery time. ZDM has the potential to improve both groups of properties:
manufacturing without failures or defects is robust, persistent and, therefore, less
vulnerable, and it also allows faster recovery after disruptions because it is more agile
and adaptable. As for the relation between ZDM and sustainability, it is
remarkable that the research of Psarommatis et al. [19] establishes such a direct relation,
and that the word sustainability plays a leading role in the very definition of ZDM provided
in that paper: “ZDM offers a holistic approach, aiming at greater manufacturing sustainability,
which ensures both process and product quality by reducing product defects through
the use of corrective, preventive, and predictive techniques made possible by data-based
technologies, and guarantees that no defective products leave the production site and
reach the customer”. However, the ZDM model has some restrictions that should be
mentioned. As a quality improvement (QI) method, ZDM differs from traditional methods,
such as lean manufacturing, six sigma or total quality management (TQM) because, while
traditional methods use historical data to improve the future without considering the
current production status, ZDM employs both historical and current data, essential for
tracing the cause of a defect and learning from the event. The downside of this advantage
is that ZDM requires intensive real-time data use, without which the model's efficiency is
compromised [19].
Thus it seems reasonable to think that the DT technology, the ML method and the ZDM model,
when applied to the MPS, are each individually aligned with a smart MPS model that
contributes to a more resilient and sustainable SC. However, this alignment is reinforced
when the three are combined in the DT-ML-ZDM scheme, given the cross synergies among them,
of which the following stand out: (i) the DT technology is favored by the ML method because
ML enables the real-time prescription of solutions to the MPS problem in high-dimensional
problems, a field in which traditional methods such as analytics, simulation and heuristics
are limited; (ii) the DT technology is favored by the ZDM model because ZDM mitigates the
number and magnitude of disturbances on the system, which favors its automation, including
DT functions; (iii) the ML method is favored by the DT technology because virtualizing the
real environment allows the ML agent to act on that environment only once it has been
trained and positively evaluated, and is therefore able to prescribe, which confers planning
robustness; (iv) for the same reason, the ZDM model is favored by the DT technology because
the fact that the ML agent acts only once it is trained favors the reduction and mitigation
of errors and, thus, the elimination of defects; and, finally, (v) the ZDM model is favored
by the ML method because ML supports the real-time data feed that ZDM needs to properly
carry out its function. The potential benefits are, therefore, significant. Yet the other
side of the coin is marked by the possible barriers and risks associated with implementing
the DT-ML-ZDM scheme which, according to the research of Müller et al. [77], include those
associated with: (i) suppliers and SC partners, e.g., their critical attitude toward
changes, or rejection of data transparency; (ii) organization and implementation, e.g., the
amount of investment required, or lack of resources or expertise; (iii) data management,
e.g., data security, quality or availability; (iv) human aspects, e.g., the role of new
employees, labor market disruptions or critical attitudes to change; (v) technology, e.g.,
its implementation procedure, overestimation of its benefits, use of immature systems or
poor selection; and, finally, (vi) legal issues and standards, e.g., public framework
conditions, standardization and business ethics. Digital integration with customers is also
an aspect to consider in this regard given the positive influence it will have on SC
management and performance, as indicated by Queiroz et al. [78]. Thus an effective MPS
digital transformation process according to the scheme proposed herein must take into
account all these challenges and risks beyond the simple synergistic implementation of
DT-ML-ZDM.
That said, it is also worth mentioning that, from a practical perspective, MPS robustness
rests on several fundamental pillars, of which the following are highlighted: (i) the
accuracy of demand forecasts; (ii) the consideration of realistic constraints and deadlines;
(iii) the use of accurate calculation methods; (iv) the flexibility to synchronize with the
evolution of demand patterns; (v) a fluid movement of information between agents and areas;
and (vi) acceptance by the parties involved. Thus instruments that support this structure by
giving the agents involved, in their different areas and at their distinct decision levels,
visibility of SC systems and processes,
collaborative interaction capacity, wide data access, simulation and off-line analysis power,
and real-time action capacity, can all be key to minimizing, mitigating and/or eliminating
failures and defects, and to reinforcing SC resilience and sustainability. Therefore, in the
particular MPS context, virtualization by means of the DT, the intelligence brought to
decision-making processes with ML assistance, and the stable flow of processes in line with
the zero-defect philosophy can play a significant role in the smart MPS.
5. Conclusions
This paper proposes an initial DT-based conceptual framework to model, optimize
and prescribe the MPS in an SC in a ZDM context. This framework focuses on developing
optimization algorithms to solve the MPS problem in the specific described environment
based on digital twinning with the support of DRL techniques.
The proposed DT-based model, designed to accommodate the set of stakeholders in
the SC, along with their real and virtual processes and resources in two different planes, is
described. The DRL-based DT-driven MPS setup is also presented.
Both the described framework and its configuration are considered a first contribution
of this research. Its design aims to improve SC performance by reinforcing its digitization,
intelligence, visibility, interconnectedness and organization, which all take the SC toward
higher resilience and sustainability levels, a goal for any traditional SC that intends to
be transformed into SC4.0. The DT technology is distinguished by its potential to influence
all these aspects simultaneously and positively because: (i) digitization is an intrinsic
property of a DT; (ii) although the commonest purpose of a DT is to simulate, analyze,
predict or optimize, this technology allows one further step toward autonomously
prescribing, a capability that can be regarded as intelligence; (iii) a model in which the
DT replicates a specific planning subject (e.g., the MPS) for its shared use across the
entire SC has the capacity to take visibility, interconnectedness and organization
qualities to a higher level; and (iv) a more effective ZDM strategy facilitated by the model
design contributes not only to SC resilience by minimizing, mitigating or eliminating
potential disturbing factors, but also to greater sustainability.
The reinforcement learning approach offers certain benefits that are worth highlighting.
Proper DRL-based modeling that balances the exploration-exploitation trade-off can help to
solve the problem of correlating immediate planning actions with their long-term
consequences. In addition, and unlike analytical or heuristic approaches, the DRL-based
modeling approach provides an acceptable solution in real-world environments, such as
manufacturing, for those problems in which feedback is often subject to time delays,
provided that these problems can be characterized as Markov decision processes (MDP), which
is the case of the MPS problem. It is also argued that the DRL method is an effective tool
for dealing with problems that are harder to solve with analytical or heuristic approaches
because of their implicit computational complexity.
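Two of the notions mentioned above can be sketched in a few lines: the discounted return,
which links an immediate action to delayed consequences, and epsilon-greedy action
selection, one common and simple way of balancing exploration and exploitation. This is a
generic illustration and not the configuration proposed in this paper; the function names,
the reward sequence and the action values are assumptions chosen for the example.

```python
import random
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of rewards weighted by gamma**k, computed backwards in time."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def epsilon_greedy(q_values: Sequence[float], epsilon: float,
                   rng: random.Random) -> int:
    """Explore a random action with probability epsilon, else exploit the best."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


# A stockout penalty that only materializes two periods after the decision.
delayed_rewards = [0.0, 0.0, -10.0]
print(discounted_return(delayed_rewards, gamma=0.9))  # approximately -8.1

# Hypothetical action values for three candidate production quantities.
rng = random.Random(0)
print(epsilon_greedy([1.0, 3.0, 2.0], epsilon=0.0, rng=rng))  # 1 (pure exploitation)
```

With a discount factor below one, the delayed penalty still reaches the decision that caused
it, but with less weight than an immediate cost of the same size.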
This proposal has some limitations. The model does not foresee the inclusion of
financial considerations. Moreover, because unwise decisions could put at risk the economic
value of the resources that the SC actors commit to the MPS, it is advisable to restrict the
DT's prescriptive action in a first stage so that the final MPS confirmation depends on the
human operator. This recommendation would remain in force until the system's reliability is
properly verified.
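The recommended first-stage restriction can be sketched as a confirmation gate between the
DT's proposal and the released MPS. The names `confirm_mps` and
`planner_rejects_large_changes`, and the 30-unit threshold, are hypothetical; the callback
merely stands in for the decision of the human operator.

```python
from typing import Callable, List

Plan = List[int]


def confirm_mps(current: Plan, proposed: Plan,
                approve: Callable[[Plan, Plan], bool]) -> Plan:
    """Release the DT's proposal only if the human operator approves it."""
    return list(proposed) if approve(current, proposed) else list(current)


def planner_rejects_large_changes(current: Plan, proposed: Plan) -> bool:
    """Stand-in for the operator's judgment: accept changes of up to 30 units."""
    return all(abs(p - c) <= 30 for c, p in zip(current, proposed))


current_mps = [100, 120, 110]
print(confirm_mps(current_mps, [110, 125, 100], planner_rejects_large_changes))
# [110, 125, 100]
print(confirm_mps(current_mps, [40, 125, 100], planner_rejects_large_changes))
# [100, 120, 110]
```

Once the system's reliability has been verified, the gate could be relaxed or removed
without changing the rest of the planning flow.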
Regarding research perspectives, this conceptual framework has to be considered a
starting point and roadmap for modeling, applications and empirical validations in
a real-world SC MPS case study. It is also necessary to study whether the modeling approach
can be extended to other planning levels, such as MRP or inbound and outbound logistics, and
under what conditions this would be possible.
Although the proposed conceptual framework accommodates all the intervening
actors in the SC, developing the model beyond the manufacturer and its suppliers at the
two closest tiers is challenging and opens up a supplementary research line. The same
conclusion is reached for the task of incorporating additional supplier tiers into the previous
two, plus logistics warehousers, wholesale distributors, retailers and, finally, customers.
A better understanding of the relevance of the human factor in SC4.0 and its planning
would also be a topic for further research. The European Union’s Industry 5.0 initiative falls
in line with this, and it is worth mentioning the desirability of further research into the role
that humans should play in environments where not only the most physically demanding
or risky tasks are being transferred from humans to systems, but also the responsibility for
decision making.
Lastly, the described conceptual framework, and the technical background behind the
proposed DT, can be adapted to other novel alternative tactical planning frameworks, such
as adaptive sales and operations planning (AS&OP), which derives from the demand-driven
adaptive enterprise (DDAE) model, by replacing the MPS subject with others, e.g., the
replenishment of items in the buffers identified at the tactical level. Although it is
formulated here as an MDP, the problem can also be modeled as a nonlinear, stochastic and/or
fuzzy problem to face uncertainty, which would be a promising future research line.
Author Contributions: Conceptualization, J.C.S.-R.; methodology, J.C.S.-R.; validation, J.M. and R.P.;
formal analysis, J.C.S.-R. and J.M.; investigation, J.C.S.-R.; writing-original draft preparation, J.C.S.-R.;
writing-review and editing, J.C.S.-R., J.M. and R.P.; funding acquisition, J.M. and R.P. All authors
have read and agreed to the published version of the manuscript.
Funding: The research leading to these results received funding from the European Union H2020
Program with grant agreements No. 825631 “Zero-Defect Manufacturing Platform (ZDMP)” and
No. 958205 “Industrial Data Services for Quality Control in Smart Manufacturing (i4Q)”, and from
Grant RTI2018-101344-B-I00 funded by MCIN/AEI/10.13039/501100011033 and by “ERDF A way of
making Europe”.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data has been presented in the main text.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Ferrantino, M.J.; Koten, E.E. Understanding Supply Chain 4.0 and Its Potential Impact on Global Value Chains. Glob. Value Chain.
Dev. Rep. 2019, 2019, 103.
2. Marmolejo-Saucedo, J.; Hartmann, S. Trends in Digitization of the Supply Chain: A Brief Literature Review. EAI Endorsed Trans.
Energy Web 2020, 7, e8. [CrossRef]
3. Büyüközkan, G.; Göçer, F. Digital Supply Chain: Literature Review and a Proposed Framework for Future Research. Comput. Ind.
2018, 97, 157–177. [CrossRef]
4. Dossou, P.E. Impact of Sustainability on the Supply Chain 4.0 Performance. Procedia Manuf. 2018, 17, 452–459. [CrossRef]
5. Winkelhaus, S.; Grosse, E.H. Logistics 4.0: A Systematic Review towards a New Logistics System. Int. J. Prod. Res. 2020, 58, 18–43.
[CrossRef]
6. Feldt, J.; Kourouklis, T.; Kontny, H.; Wagenitz, A.; Teti, R.; D’Addona, D.M. Digital Twin: Revealing Potentials of Real-Time
Autonomous Decisions at a Manufacturing Company. Procedia CIRP 2020, 88, 185–190. [CrossRef]
7. Serrano-Ruiz, J.C.; Mula, J.; Poler, R. Smart Manufacturing Scheduling: A Literature Review. J. Manuf. Syst. 2021, 61, 265–287.
[CrossRef]
8. John, H.; Blackstone, P.C., Jr. (Eds.) Association for Supply Chain Management (APICS) APICS Dictionary, 14th ed.; APICS: Chicago,
IL, USA, 2014.
9. Liu, M.; Fang, S.; Dong, H.; Xu, C. Review of Digital Twin about Concepts, Technologies, and Industrial Applications. J. Manuf.
Syst. 2021, 58, 346–361. [CrossRef]
10. Wang, Y.; Wang, X.; Liu, A. Digital Twin-Driven Supply Chain Planning. Procedia CIRP 2020, 93, 198–203. [CrossRef]
11. Jones, D.; Snider, C.; Nassehi, A.; Yon, J.; Hicks, B. Characterising the Digital Twin: A Systematic Literature Review. CIRP J.
Manuf. Sci. Technol. 2020, 29, 36–52. [CrossRef]
12. Marmolejo-Saucedo, J.A.; Hurtado-Hernandez, M.; Suarez-Valdes, R. Digital Twins in Supply Chain Management: A Brief
Literature Review. In Proceedings of the Intelligent Computing and Optimization ICO 2020, Koh Samui, Thailand, 17–18
December 2020; Vasant, P., Zelinka, I., Weber, G.-W., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp.
653–661.
13. Ivanov, D.; Dolgui, A.; Das, A.; Sokolov, B. Digital Supply Chain Twins: Managing the Ripple Effect, Resilience, and Disruption
Risks by Data-Driven Optimization, Simulation, and Visibility. In Handbook of Ripple Effects in the Supply Chain; Ivanov, D., Dolgui,
A., Sokolov, B., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 309–332. ISBN 978-3-030-14302-2.
14. Angione, G.; Cristalli, C.; Barbosa, J.; Leitão, P. Integration Challenges for the Deployment of a Multi-Stage Zero-Defect
Manufacturing Architecture. In Proceedings of the IEEE 17th International Conference on Industrial Informatics INDIN 2019,
Helsinki, Finland, 22–25 July 2019; Institute of Electrical and Electronics Engineers (IEEE): Piscataway Township, NJ, USA, 2019;
ISBN 9781728129273.
15. Psarommatis, F.; Kiritsis, D. A Hybrid Decision Support System for Automating Decision Making in the Event of Defects in the
Era of Zero Defect Manufacturing. J. Ind. Inf. Integr. 2021, 100263. [CrossRef]
16. Lindström, J.; Kyösti, P.; Birk, W.; Lejon, E. An Initial Model for Zero Defect Manufacturing. Appl. Sci. 2020, 10, 4570. [CrossRef]
17. Psarommatis, F.; May, G.; Dreyfus, P.A.; Kiritsis, D. Zero Defect Manufacturing: State-of-the-Art Review, Shortcomings and
Future Directions in Research. Int. J. Prod. Res. 2020, 58, 1–17. [CrossRef]
18. Serrano, J.C.; Mula, J.; Poler, R. Digital Twin for Supply Chain Master Planning in Zero-Defect Manufacturing. In Technological
Innovation for Applied AI Systems; Camarinha-Matos, L.M., Ferreira, P., Brito, G., Eds.; Springer International Publishing: Cham,
Switzerland, 2021; pp. 102–111.
19. Psarommatis, F.; Sousa, J.; Mendonça, J.P.; Kiritsis, D. Zero-Defect Manufacturing the Approach for Higher Manufacturing
Sustainability in the Era of Industry 4.0: A Position Paper. Int. J. Prod. Res. 2021, 1–19. [CrossRef]
20. Bakar, M.R.A.; Abbas, I.T.; Kalal, M.A.; AlSattar, H.A.; Bakhayt, A.-G.K.; Kalaf, B.A. Solution for Multi-Objective Optimisation
Master Production Scheduling Problems Based on Swarm Intelligence Algorithms. J. Comput. Theor. Nanosci. 2017, 14, 5184–5194.
[CrossRef]
21. Zaidan, A.A.; Atiya, B.; Abu Bakar, M.R.; Zaidan, B.B. A New Hybrid Algorithm of Simulated Annealing and Simplex Downhill
for Solving Multiple-Objective Aggregate Production Planning on Fuzzy Environment. Neural Comput. Appl. 2019, 31, 1823–1834.
[CrossRef]
22. Wu, Z.-J.; Wang, W.; Zhou, J.; Ren, F.-F.; Zhang, C. Research on Double Objective Optimization of Master Production Schedule
Based on Ant Colony Algorithm. In Proceedings of the 2010 International Conference on Computational Intelligence and Security,
CIS 2010, Nanning, China, 11–14 December 2010; pp. 200–204.
23. Usuga Cadavid, J.P.; Lamouri, S.; Grabot, B.; Pellerin, R.; Fortin, A. Machine Learning Applied in Production Planning and
Control: A State-of-the-Art in the Era of Industry 4.0. J. Intell. Manuf. 2020, 31, 1531–1558. [CrossRef]
24. Cadavid, J.P.U.; Lamouri, S.; Grabot, B.; Fortin, A. Machine Learning in Production Planning and Control: A Review of Empirical
Literature. IFAC-PapersOnLine 2019, 52, 385–390. [CrossRef]
25. Dolgui, A.; Ould-Louly, M.-A. A model for supply planning under lead time uncertainty. Int. J. Prod. Econ. 2002, 78, 145–152.
[CrossRef]
26. Géhan, M.; Castanier, B.; Lemoine, D. Joint Optimization of a Master Production Schedule and a Preventive Maintenance Policy.
In Proceedings of the 2013 International Conference on Industrial Engineering and Systems Management (IESM), Agdal, Morocco,
28–30 October 2013; Institute of Electrical and Electronics Engineers (IEEE): Piscataway Township, NJ, USA, 2013; pp. 1–7, ISBN
978-2-9600532-4-1.
27. Lechuga, G.P.; Martínez, F.V.; Ramírez, E.P. Stochastic Optimization of Manufacture Systems by Using Markov Decision Processes.
In Handbook of Research on Modern Optimization Algorithms and Applications in Engineering and Economics; Vasant, P., Weber, G.,
Dieu, V.N., Eds.; IGI Global: Hershey, PA, USA, 2016; Chapter 7; pp. 185–208. [CrossRef]
28. Vaidya, S.; Ambad, P.; Bhosle, S. Industry 4.0—A Glimpse. Procedia Manuf. 2018, 20, 233–238. [CrossRef]
29. Kritzinger, W.; Karner, M.; Traar, G.; Henjes, J.; Sihn, W. Digital Twin in Manufacturing: A Categorical Literature Review and
Classification. IFAC-PapersOnLine 2018, 51, 1016–1022. [CrossRef]
30. Drath, R.; Horch, A. Industrie 4.0: Hit or Hype? [Industry Forum]. IEEE Ind. Electron. Mag. 2014, 8, 56–58. [CrossRef]
31. Frederico, G.F.; Garza-Reyes, J.A.; Anosike, A.; Kumar, V. Supply Chain 4.0: Concepts, Maturity and Research Agenda. Supply
Chain. Manag. An. Int. J. 2020, 25, 262–282. [CrossRef]
32. Zekhnini, K.; Cherrafi, A.; Bouhaddou, I.; Benghabrit, Y.; Garza-Reyes, J.A. Supply Chain Management 4.0: A Literature Review
and Research Framework. Benchmarking An. Int. J. 2021, 28, 465–501. [CrossRef]
33. Tang, O.; Grubbström, R.W. Planning and Replanning the Master Production Schedule under Demand Uncertainty. Int. J. Prod.
Econ. 2002, 78, 323–334. [CrossRef]
34. Zhao, X.; Xie, J.; Jiang, Q. Lot-sizing Rule and Freezing the Master Production Schedule under Capacity Constraint and
Deterministic Demand. Prod. Oper. Manag. 2001, 10, 45–67. [CrossRef]
35. Zhuang, C.; Liu, J.; Xiong, H. Digital Twin-Based Smart Production Management and Control Framework for the Complex
Product Assembly Shop-Floor. Int. J. Adv. Manuf. Technol. 2018, 96, 1149–1163. [CrossRef]
36. Bao, J.; Guo, D.; Li, J.; Zhang, J. The Modelling and Operations for the Digital Twin in the Context of Manufacturing. Enterp. Inf.
Syst. 2019, 13, 534–556. [CrossRef]
37. Negri, E.; Fumagalli, L.; Macchi, M. A Review of the Roles of Digital Twin in CPS-Based Production Systems. Procedia Manuf.
2017, 11, 939–948. [CrossRef]
38. Mitchell, T.M. Machine Learning; The McGraw-Hill Companies: New York, NY, USA, 1997; Volume 2.
39. El Naqa, I.; Murphy, M.J. What Is Machine Learning? In Machine Learning in Radiation Oncology: Theory and Applications; El Naqa,
I., Li, R., Murphy, M.J., Eds.; Springer International Publishing: Cham, Switzerland, 2015; pp. 3–11. ISBN 978-3-319-18305-3.
40. Stupar, S.; Bičo Ćar, M.; Kurtović, E.; Vico, G. The Importance of Machine Learning in Intelligent Systems. In Proceedings of the
New Technologies, Development and Application IV, Sarajevo, Bosnia and Herzegovina, 24–26 June 2021; Karabegović, I., Ed.;
Springer International Publishing: Cham, Switzerland, 2021; pp. 638–646.
41. Halpin, J.F. Zero Defects: A New Dimension in Quality Assurance; McGraw-Hill: New York, NY, USA, 1966.
42. Psarommatis, F.; Kiritsis, D.; Kiritsis, D.; Moon, I.; Park, J.; von Cieminski, G.; Lee, G.M. A Scheduling Tool for Achieving Zero
Defect Manufacturing (ZDM): A Conceptual Framework. IFIP Adv. Inf. Commun. Technol. 2018, 536, 271–278.
43. Simmons, A.B.; Chappell, S.G. Artificial Intelligence-Definition and Practice. IEEE J. Ocean. Eng. 1988, 13, 14–42. [CrossRef]
44. Ghadge, A.; Er Kara, M.; Moradlou, H.; Goswami, M. The Impact of Industry 4.0 Implementation on Supply Chains. J. Manuf.
Technol. Manag. 2020, 31, 669–686. [CrossRef]
45. Horváth, D.; Szabó, R.Z. Driving Forces and Barriers of Industry 4.0: Do Multinational and Small and Medium-Sized Companies
Have Equal Opportunities? Technol. Forecast. Soc. Chang. 2019, 146, 119–132. [CrossRef]
46. Breakspear, A. A New Definition of Intelligence. Intell. Natl. Secur. 2013, 28, 678–693. [CrossRef]
47. Rezaei, M.; Akbarpour Shirazi, M.; Karimi, B. IoT-Based Framework for Performance Measurement. Ind. Manag. Data Syst. 2017,
117, 688–712. [CrossRef]
48. Wieland, A.; Durach, C.F. Two Perspectives on Supply Chain Resilience. J. Bus. Logist. 2021, 42, 315–322. [CrossRef]
49. Tukamuhabwa, B.R.; Stevenson, M.; Busby, J.; Zorzini, M. Supply Chain Resilience: Definition, Review and Theoretical Founda-
tions for Further Study. Int. J. Prod. Res. 2015, 53, 5592–5623. [CrossRef]
50. Ponomarov, S.Y.; Holcomb, M.C. Understanding the Concept of Supply Chain Resilience. Int. J. Logist. Manag. 2009, 20, 124–143.
[CrossRef]
51. Sisco, C.; Chorn, B.; Pruzan-Jorgensen, P.M. Supply Chain Sustainability. A Practical Guide for Continuous Improvement; United
Nations Global Compact and BSR: New York, NY, USA, 2015.
52. Giannakis, M.; Papadopoulos, T. Supply Chain Sustainability: A Risk Management Approach. Int. J. Prod. Econ. 2016, 171,
455–470. [CrossRef]
53. Sustainable Supply Chains. Models, Methods, and Public Policy Implications. In International Series in Operations Research &
Management Science; Boone, T., Jayaraman, V., Ganeshan, R., Eds.; Springer: Cham, Switzerland, 2012; Volume 174.
54. Maryniak, A.; Bulhakova, Y.; Lewoniewski, W.; Bal, M. Diffusion of Knowledge in the Supply Chain over Thirty Years—Thematic
Areas and Sources of Publications. In Proceedings of the Information and Software Technologies ICIST 2020, Kaunas, Lithuania,
15–17 October 2020; Lopata, A., Butkienė, R., Gudonienė, D., Sukackė, V., Eds.; Springer International Publishing: Cham,
Switzerland, 2020; pp. 113–126.
55. Chern, C.-C.; Lei, S.-T.; Huang, K.-L. Solving a Multi-Objective Master Planning Problem with Substitution and a Recycling
Process for a Capacitated Multi-Commodity Supply Chain Network. J. Intell. Manuf. 2014, 25, 1–25. [CrossRef]
56. Grillo, H.; Peidro, D.; Alemany, M.M.E.; Mula, J. Application of Particle Swarm Optimisation with Backward Calculation to Solve
a Fuzzy Multi-Objective Supply Chain Master Planning Model. Int. J. Bio-Inspired Comput. 2015, 7, 157–169. [CrossRef]
57. Sutthibutr, N.; Chiadamrong, N. Applied Fuzzy Multi-Objective with α-Cut Analysis for Optimizing Supply Chain Master
Planning Problem. In Proceedings of the 2019 International Conference on Management Science and Industrial Engineering,
Phuket, Thailand, 24–26 May 2019; Association for Computing Machinery: New York, NY, USA, 2019; pp. 84–91.
58. Arani, H.V.; Torabi, S.A. Integrated Material-Financial Supply Chain Master Planning under Mixed Uncertainty. Inf. Sci. 2018,
423, 96–114. [CrossRef]
59. Ghasemy Yaghin, R.; Sarlak, P.; Ghareaghaji, A.A. Robust Master Planning of a Socially Responsible Supply Chain under
Fuzzy-Stochastic Uncertainty (A Case Study of Clothing Industry). Eng. Appl. Artif. Intell. 2020, 94, 103715. [CrossRef]
60. Martín, A.G.; Díaz-Madroñero, M.; Mula, J. Master Production Schedule Using Robust Optimization Approaches in an Automobile
Second-Tier Supplier. Cent. Eur. J. Oper. Res. 2020, 28, 143–166. [CrossRef]
61. Peidro, D.; Mula, J.; Alemany, M.M.E.; Lario, F.-C. Fuzzy Multi-Objective Optimisation for Master Planning in a Ceramic Supply
Chain. Int. J. Prod. Res. 2012, 50, 3011–3020. [CrossRef]
62. Orozco-Romero, A.; Arias-Portela, C.Y.; Saucedo, J.A.M. The Use of Agent-Based Models Boosted by Digital Twins in the Supply
Chain: A Literature Review. In Proceedings of the Intelligent Computing and Optimization, ICO 2020, Koh Samui, Thailand,
17–18 December 2020; Vasant, P., Zelinka, I., Weber, G.-W., Eds.; Springer International Publishing: Cham, Switzerland, 2020;
pp. 642–652.
63. Barykin, S.Y.; Bochkarev, A.A.; Kalinina, O.V.; Yadykin, V.K. Concept for a Supply Chain Digital Twin. Int. J. Math. Eng. Manag.
Sci. 2020, 5, 1498–1515. [CrossRef]
64. Ivanov, D.; Das, A. Coronavirus (COVID-19/SARS-CoV-2) and Supply Chain Resilience: A Research Note. Int. J. Integr. Supply
Manag. 2020, 13, 90–102. [CrossRef]
65. Dolgui, A.; Ivanov, D.; Sokolov, B. Reconfigurable Supply Chain: The X-Network. Int. J. Prod. Res. 2020, 58, 4138–4163. [CrossRef]
66. Park, K.T.; Son, Y.H.; Noh, S.D. The Architectural Framework of a Cyber Physical Logistics System for Digital-Twin-Based
Supply Chain Control. Int. J. Prod. Res. 2021, 59, 5721–5742. [CrossRef]
67. Alves, J.C.; Mateus, G.R. Deep Reinforcement Learning and Optimization Approach for Multi-Echelon Supply Chain with
Uncertain Demands. In Proceedings of the Computational Logistics ICCL 2020, Enschede, The Netherlands, 28–30 September
2020; Lalla-Ruiz, E., Mes, M., Voß, S., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 584–599.
68. Peng, Z.; Zhang, Y.; Feng, Y.; Zhang, T.; Wu, Z.; Su, H. Deep Reinforcement Learning Approach for Capacitated Supply Chain
Optimization under Demand Uncertainty. In Proceedings of the 2019 Chinese Automation Congress, CAC 2019, Hangzhou,
China, 22–24 November 2019; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2019; pp. 3512–3517.
69. Boute, R.N.; Gijsbrechts, J.; van Jaarsveld, W.; Vanvuchelen, N. Deep Reinforcement Learning for Inventory Control: A Roadmap.
Eur. J. Oper. Res. 2021. [CrossRef]
70. Tariq Afridi, M.; Nieto-Isaza, S.; Ehm, H.; Ponsignon, T.; Hamed, A. A Deep Reinforcement Learning Approach for Optimal
Replenishment Policy in A Vendor Managed Inventory Setting for Semiconductors. In Proceedings of the 2020 Winter Simulation
Conference, WSC 2020, Orlando, FL, USA, 14–18 December 2020; Bae, K.-H., Feng, B., Kim, S., Lazarova-Molnar, S., Zheng, Z.,
Roeder, T., Thiesing, R., Eds.; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2020; pp. 1753–1764.
[CrossRef]
71. Kegenbekov, Z.; Jackson, I. Adaptive Supply Chain: Demand-Supply Synchronization Using Deep Reinforcement Learning.
Algorithms 2021, 14, 240. [CrossRef]
72. Siddh, M.M.; Soni, G.; Gadekar, G.; Jain, R. Integrating Lean Six Sigma and Supply Chain Approach for Quality and Business
Performance. In Proceedings of the 2014 2nd International Conference on Business and Information Management (ICBIM),
Durgapur, India, 9–11 January 2014; pp. 53–57.
73. Pardamean Gultom, G.D.; Wibisono, E. A Framework for the Impact of Lean Six Sigma on Supply Chain Performance in
Manufacturing Companies. IOP Conf. Ser. Mater. Sci. Eng. 2019, 528, 012089. [CrossRef]
74. Poornachandrika, V.; Venkatasudhakar, M. Quality Transformation to Improve Customer Satisfaction: Using Product, Process,
System and Behaviour Model. IOP Conf. Ser. Mater. Sci. Eng. 2020, 923, 012034. [CrossRef]
75. Thakur, V.; Mangla, S.K. Change Management for Sustainability: Evaluating the Role of Human, Operational and Technological
Factors in Leading Indian Firms in Home Appliances Sector. J. Clean. Prod. 2019, 213, 847–862. [CrossRef]
76. Lee, H.L.; Padmanabhan, V.; Whang, S. The Bullwhip Effect in Supply Chains. Sloan Manag. Rev. 1997, 38, 93–102. [CrossRef]
77. Müller, J.M.; Schmidt, M.-C.; Rücker, M.; Veile, J.W.; Birkel, H.; Voigt, K.-I. Pitfalls, Sticks and Stones: Understanding Challenges
Industry 4.0 Poses For Inter-Company Logistics. In Proceedings of the International Symposium on Logistics (ISL 2021), Seoul,
Korea, 12–13 July 2021; pp. 153–161.
78. Queiroz, M.M.; Pereira, S.C.F.; Telles, R.; Machado, M.C. Industry 4.0 and Digital Supply Chain Capabilities. Benchmarking Int. J.
2021, 28, 1761–1782. [CrossRef]
79. Cañas, H.; Mula, J.; Díaz-Madroñero, M.; Campuzano-Bolarín, F. Implementing Industry 4.0 Principles. Comput. Ind. Eng. 2021,
158, 107379. [CrossRef]
80. Hermann, M.; Pentek, T.; Otto, B. Design Principles for Industrie 4.0 Scenarios. In Proceedings of the 2016 49th Hawaii
International Conference on System Sciences (HICSS), Koloa, HI, USA, 5–8 January 2016; pp. 3928–3937.
81. Nosalska, K.; Piątek, Z.M.; Mazurek, G.; Rządca, R. Industry 4.0: Coherent Definition Framework with Technological and
Organizational Interdependencies. J. Manuf. Technol. Manag. 2020, 31, 837–862. [CrossRef]
82. Ghobakhloo, M. The Future of Manufacturing Industry: A Strategic Roadmap toward Industry 4.0. J. Manuf. Technol. Manag.
2018, 29, 910–936. [CrossRef]
83. Ivanov, D.; Tang, C.S.; Dolgui, A.; Battini, D.; Das, A. Researchers’ Perspectives on Industry 4.0: Multi-Disciplinary Analysis and
Opportunities for Operations Management. Int. J. Prod. Res. 2021, 59, 2055–2078. [CrossRef]
84. Habib, M.K.; Chimsom, C. Industry 4.0: Sustainability and Design Principles. In Proceedings of the 2019 20th International
Conference on Research and Education in Mechatronics (REM), Wels, Austria, 23–24 May 2019; pp. 1–8. [CrossRef]
85. Chiarello, F.; Trivelli, L.; Bonaccorsi, A.; Fantoni, G. Extracting and Mapping Industry 4.0 Technologies Using Wikipedia. Comput.
Ind. 2018, 100, 244–257. [CrossRef]
86. Rathore, M.M.; Shah, S.A.; Shukla, D.; Bentafat, E.; Bakiras, S. The Role of AI, Machine Learning, and Big Data in Digital Twinning:
A Systematic Literature Review, Challenges, and Opportunities. IEEE Access 2021, 9, 32030–32052. [CrossRef]
87. Serrano-Ruiz, J.C.; Mula, J.; Poler Escoto, R. A Metamodel for Digital Planning in the Supply Chain 4.0. J. Ind. Inf. Integr.
Under review.
88. Serrano-Ruiz, J.C.; Mula, J.; Poler Escoto, R. Smart Digital Twin for ZDM-Based Job-Shop Scheduling. In Proceedings of the 2021
IEEE International Workshop on Metrology for Industry 4.0 & IoT (MetroInd4.0&IoT), Rome, Italy, 7–9 June 2021; pp. 510–515.
89. Ma, J.; Chen, H.M.; Zhang, Y.; Guo, H.F.; Ren, Y.P.; Mo, R.; Liu, L.Y. A Digital Twin-Driven Production Management System for
Production Workshop. Int. J. Adv. Manuf. Technol. 2020, 110, 1385–1397. [CrossRef]
90. Moyne, J.; Qamsane, Y.; Balta, E.C.; Kovalenko, I.; Faris, J.; Barton, K.; Tilbury, D.M. A Requirements Driven Digital Twin
Framework: Specification and Opportunities. IEEE Access 2020, 8, 107781–107801. [CrossRef]
91. LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521, 436–444. [CrossRef]
92. Meisheri, H.; Sultana, N.N.; Baranwal, M.; Baniwal, V.; Nath, S.; Verma, S.; Ravindran, B.; Khadilkar, H. Scalable Multi-Product
Inventory Control with Lead Time Constraints Using Reinforcement Learning. Neural Comput. Appl. 2021, 1, 1–23. [CrossRef]
93. Psarommatis, F.; Prouvost, S.; May, G.; Kiritsis, D. Product Quality Improvement Policies in Industry 4.0: Characteristics, Enabling
Factors, Barriers, and Evolution Toward Zero Defect Manufacturing. Front. Comput. Sci. 2020, 2, 26. [CrossRef]
94. Lindström, J.; Lejon, E.; Kyösti, P.; Mecella, M.; Heutelbeck, D.; Hemmje, M.; Sjödahl, M.; Birk, W.; Gunnarsson, B. Towards
Intelligent and Sustainable Production Systems with a Zero-Defect Manufacturing Approach in an Industry 4.0 Context. Procedia
CIRP 2019, 81, 880–885. [CrossRef]
95. Nazarenko, A.A.; Sarraipa, J.; Camarinha-Matos, L.M.; Grunewald, C.; Dorchain, M.; Jardim-Goncalves, R. Analysis of Relevant
Standards for Industrial Systems to Support Zero Defects Manufacturing Process. J. Ind. Inf. Integr. 2021, 23, 100214. [CrossRef]
96. Psarommatis, F.; Zheng, X.; Kiritsis, D. A Two-Layer Criteria Evaluation Approach for Re-Scheduling Efficiently Semi-Automated
Assembly Lines with High Number of Rush Orders. Procedia CIRP 2021, 97, 172–177. [CrossRef]
97. Weichhart, G.; Mangler, J.; Raschendorfer, A.; Mayr-Dorn, C.; Huemer, C.; Hämmerle, A.; Pichler, A. An Adaptive System-of-
Systems Approach for Resilient Manufacturing. Elektrotechnik Und Inf. 2021, 138, 341–348. [CrossRef]
