Prediction of 5G System
ZIYU CHENG
Abstract
End-to-end delay measurement is deemed crucial for network models at all
times as it acts as a pivotal metric of the model’s effectiveness, assists in
delineating its performance ceiling, and stimulates further refinement and
enhancement. This premise holds true for 5G Core Network (5GC) models as
well. Commercial 5G models, with their intricate topological structures and
requirement for reduced latencies, necessitate an effective model to anticipate
each server’s current latency and load levels.
Consequently, a model for estimating the present latency and load levels of
each network element server would be advantageous. This thesis records and
analyzes the packet data and CPU load data of network functions running at
different user counts, using the data from each successful service operation
as model data for analyzing the relationship between latency and CPU load.
Particular emphasis is placed on the end-to-end latency of the PDU session
establishment scenario for two core functions: the Access and Mobility
Management Function (AMF) and the Session Management Function (SMF).
Through this methodology, a more accurate model has been developed to
review the latency of servers and nodes when used by up to 650,000 end users.
This approach has provided new insights for network-level testing, paving the
way for a comprehensive understanding of network performance under various
conditions. These conditions include flow-control strategies such as "slow
start" and "TCP delayed acknowledgment", as well as overload situations where
the load of network functions exceeds 80%. The model also identifies the
optimal performance range.
Keywords
Network model, End-to-end delay, 5GC, Prediction model, CPU load
Summary
End-user latency measurements are considered important for network models
because they serve as a yardstick for the model's effectiveness, help define
its performance ceiling, and contribute to further refinement and improvement.
This assumption also holds for the 5G Core Network (5GC). Commercial 5G
networks, with their complex topology and low-latency requirements, need an
effective model to predict each server's current load and latency contribution.
Consequently, a model is needed that describes the current latency and its
dependence on the load level of each network element. The work consists of
collecting and analyzing packet data and CPU load for network functions
operating with different numbers of end users. The focus is on services used
as model data for analyzing the relationship between latency and CPU load.
Particular attention is paid to end-user latency during PDU session
establishment for two core functions: the Access and Mobility Management
Function (AMF) and the Session Management Function (SMF).
Through this methodology, a more accurate model has been developed for
examining the latency of servers and nodes when used by up to 650,000 end
users. This approach has provided new insights for network-level testing,
paving the way for a comprehensive understanding of network performance under
various conditions. These conditions include strategies such as "slow start"
and "TCP delayed acknowledgment" for flow control, or overload situations
where the load of the network functions exceeds 80%. It also identifies the
optimal performance range.
Keywords
Network model, End user, Latency, 5GC, Prediction model, CPU load
Contents
1 Introduction
1.1 Background
1.2 Problem Statement
1.3 Goals
1.4 Research Methodology
1.5 Contributions
1.6 Structure of the Thesis
2 Background
2.1 5GS
2.1.1 UE
2.1.2 RAN
2.1.3 5GC
2.2 Network function in 5GC
2.2.1 AMF
2.2.1.1 RM-REGISTERED: Registered Status
2.2.1.2 RM-DEREGISTERED: Deregistered Status
2.2.2 SMF
2.2.3 UPF
2.2.4 Other network functions
2.3 Scenario select
2.3.1 Service select
2.3.2 PDU Session Establishment
2.4 General Tools
2.4.1 Wireshark
2.4.2 PAC Manager
3 Research Methodology
3.1 Component of PDU Session Establishment
References
List of acronyms and abbreviations
5GS 5G System
E2E End-to-End
NF Network Function
UE User Equipment
Introduction
The 5G Core (5GC) is the core of the 5G mobile network and includes many
core functions, such as connectivity and mobility management, authentication
and authorization, user data management, and policy management, among
others [1, 2]. Ericsson has conducted development and extensive integration
and validation activities to deliver the industry’s best-performing 5G network,
including feature optimization. For these products, the most important way to
measure their performance is through end-to-end (E2E) latency measurements
[3]. However, many factors affect latency in 5G networks, and these factors
can be divided into two categories: signaling latency and payload latency.
Signaling latency can be understood as control plane latency, while payload
latency includes the process from User Equipment (UE) to gNodeB to the
User Plane. Although this may seem simple, it is more complex than in
previous networks such as Long-Term Evolution (LTE), and 5G also requires
lower latency.
Ericsson has integrated a test environment with corresponding software to
facilitate E2E testing. This allows a thorough representation of the entire end-
to-end process. Several specified International Mobile Subscriber Identities
(IMSIs) are used to perform different operations, such as registering a network
connection, authenticating, establishing a Protocol Data Unit (PDU) session,
switching locations or updating locations, and performing network slice
selection. Afterward, the records of each node can be obtained by capturing
packets, and it is possible to query the commands executed by each node to
retrace the entire process. Timestamps and Central Processing Unit (CPU)
load rates are also recorded. Thus, it can be verified that measuring latency
and monitoring load factors is feasible.
Therefore, it is necessary to explore the correlation between the network
load and the latency contribution of each 5GC network function. Once
we are able to monitor load levels and latency contributions, it will be
easier to understand which network size can provide the optimal end-user
latency, thereby enhancing the user experience. It’s worth noting that existing
latency analyses often treat 5GC as a part of the operator’s overall network,
but 5GC consists of many different network elements, each making distinct
contributions to the overall network [4].
1.1 Background
In the development of 5G networks, there are already more comprehensive
standards that serve as specifications within the industry. While 5G networks
continue to be the dominant discussion point in the industry [5], A. GUPTA
explores current hot topics such as 5G cellular network architectures [6],
large-scale multiple-input multiple-output technologies, and device-network
technologies. M. Skulysh et al. make improvements to 5G core gateways in
anticipation of placing some of their services in the cloud [7]. Y. Choi focuses
on Network Functions (NFs) in core networks [8], working on efficient data
path configuration when UEs are on the move.
Since the entire infrastructure is based on NFV, SDN, and network
slicing, it is necessary to understand articles related to these technologies.
Mijumbi et al. introduce the current use scenarios of NFV technology, and
it can be found that the combination of NFV and cloud technology is a
trend, which is more conducive to the flexible deployment of these NFs by
companies [9]. Michel. S et al. systematically describe how SDN and
NFV technologies combine into a new paradigm and propose
a new integrated architecture [10]. Y. Wu et al. introduce the common
scenarios of network slicing for industrial IoT and analyze its architecture;
their work helps in understanding the integration of an NF across multiple
network slices in this thesis [11]. A. Khatouni et al. used three supervised learning
algorithms, Logistic Regression, Support Vector Machines, and Decision
Tree, for classification tasks to predict large amounts of real delay data [12],
but were limited to a 4G network model. In addition, Z. Hou also proposed
an algorithm for optimizing communication quality prediction in the case
of ultra-reliable and low-latency communications [13]. For CPU occupancy
prediction, M. Duggan et al. attempted to use recurrent neural networks for
training [14]. Thus, there is still room for development in combining the key
modules of the 5G core network with latency or CPU considerations.
2. What is the correlation between 5GS E2E latency and the network load?
3. Is there a load level at which the E2E latency is optimal?
1.3 Goals
The goal is to gather and analyze data from the 5GC network to determine if
a correlation can be made between latency and CPU/Memory utilization for
different end-user use cases and for various levels of background load. The
work also includes developing a method and a series of scripts for extracting
latency data for a set of 5GC network functions.
and vertical delay (delay within NFs). This step involves identifying the
inherent processing delay of a specific NF participating in the PDU session
establishment. Combining these delays gives the total delay of that NF. With
the original data at hand, statistical analysis of the delay becomes feasible.
Concurrently, in situations of high server load for a given network function, the
collection of failed packets and the packets indicating server load protection
activation allows not only an understanding of server operation under high
load conditions but also aids in preventing such situations.
Automatic acquisition of latency and CPU/memory utilization data is
based on historical data from servers across various network functions. The
data is formatted as JSON files, which facilitates analysis using Python,
ultimately producing CPU usage diagrams during latency testing.
Equipped with latency and CPU utilization curves, a correlation can
be established by generating the corresponding scatter plots and fitting
curves to them. Polynomial fitting is used to obtain an analytical approximation.
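The correlation step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name align_and_fit, the use of linear interpolation to look up the CPU load at each latency sample, the default degree of 2, and all sample values are hypothetical and are not taken from the scripts or measurements of this thesis.

```python
import numpy as np

def align_and_fit(lat_time, latency, cpu_time, cpu_load, degree=2):
    """Look up the CPU load at each latency sample time, then fit latency vs. load."""
    # Linearly interpolate the CPU curve at the latency timestamps (assumption).
    load_at_sample = np.interp(lat_time, cpu_time, cpu_load)
    # Least-squares polynomial fit: latency as a function of CPU load.
    coeffs = np.polyfit(load_at_sample, latency, degree)
    return load_at_sample, np.poly1d(coeffs)

# Hypothetical data: relative time in s, CPU load in %, latency in ms.
cpu_time = [0, 10, 20, 30]
cpu_load = [10, 30, 50, 70]
lat_time = [0, 5, 10, 20, 30]
latency = [3.9, 3.4, 3.1, 3.1, 3.9]

loads, model = align_and_fit(lat_time, latency, cpu_time, cpu_load)
estimate = model(40.0)  # fitted latency at 40 % CPU load
```

The pairs (loads, latency) are the points of the scatter plot, and model is the analytical approximation that can be evaluated at any load level.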
Typically, product testing often occurs under a specific load level, meaning
testing is performed with a fixed number of users due to the broad
nature of product testing. This thesis emphasizes a representative PDU
session establishment case involving several common network functions. By
conducting tests with diverse user counts, the depth of the product can be
demonstrated, thereby contributing valuable data to the product’s overall
evaluation.
1.5 Contributions
This study introduces a model that examines the relationship and behavior
between load rate (or user count) and network function latency. Firstly,
valuable information from different signals and servers is extracted, with key
data preserved in a CSV format. Secondly, the implementation of automated
analysis for different information effectively reduces the testing workload to
some extent, enabling more efficient analysis of the generated packets from
Wireshark. Finally, data fitting is performed, and representative data points
are analyzed to construct the model. These steps contribute to understanding
which network size can provide the best end-user latency, enhancing the
overall network performance and user experience.
Background
2.1 5GS
This section first briefly introduces the 5G system, then moves on to the 5GC
model discussed in this thesis, and finally provides a brief introduction to NFs
other than AMF and SMF.
The 5GS, which consists of UEs, the RAN, and the 5GC, provides system-wide
functions such as mobile broadband and voice services. The 5GS supports
roaming scenarios, non-roaming scenarios, and interworking with legacy EPC
networks. Figure 2.1 shows its architecture.
In every part of the 5GS, the user plane and the control plane are separate.
The user plane carries user traffic and the control plane carries signaling
traffic. The separation of the user and control planes enables the resources
of each plane to be scaled independently. The 5GC supports
parallel access to local and centralized services. For example, the control
plane can be centralized while the user plane with low-latency services can
be distributed in local data centers.
Another key trend of 5GS is deploying Network Function Virtualization
(NFV) services and Software Defined Infrastructure (SDI) on the cloud. This
is also implemented in the practical system of this thesis. NFV improves
the flexibility of the network by separating network functions from dedicated
hardware and implementing these functions in virtual machines or containers.
This allows network service providers to deploy and scale services on cloud
infrastructure rapidly. Meanwhile, SDI enables the deployment of network
services in a more dynamic, efficient, and cost-effective manner by abstracting
the hardware layer, making it programmable and capable of being manipulated
as per the application or system needs. With these two technologies as the
fundamental architecture, 5GS is moving towards a service-based structure.
This architecture allows network functions and services to be software-driven,
flexibly deployed, and managed on the cloud, thus achieving efficient operation
of the network and optimizing resource utilization.
This thesis mainly focuses on the 5GC, so the UE and the RAN are introduced
only briefly.
2.1.1 UE
UE refers to a user device, such as a cell phone, a modem, or a laptop, that is
capable of performing various functions, such as making phone calls, surfing
the internet, sending emails, and more. In 5G networks, UE can also support
industrial [15], traffic [16], and medical applications [17].
UE communicates with RAN by sending and receiving data, and it serves
as both the initiator and the terminal for a service. If UE needs to initiate
a request to the core network, it goes through a negotiation process between
RAN and the core network. The basic process can be described as below:
2.1.2 RAN
RAN (Radio Access Network) includes base stations (eNodeB in LTE and
gNodeB in 5G), wireless channels, and other facilities. RAN provides wireless
access and data transmission services to UE, manages wireless resources
and security, manages and maintains base station equipment, and optimizes
and improves the network [18, pp. 82-84]. As previously mentioned, UE
interacts with RAN. After UE is connected to the mobile communication
system through RAN, RAN decodes, encodes, and converts signals sent by
UE and transmits them to the core network.
For the part related to UE, RAN provides wireless access and
data transmission services and also requires security management and
authentication of mobile devices, in addition to allocating wireless resources.
RAN itself also has functions such as cell partitioning, power control,
interference management, etc.
The interaction between RAN and the core network involves the following
processes:
• Decoding the information sent by UE, and then encoding it for error
checking and data recovery.
2.1.3 5GC
The 5GC is defined as a service-based architecture. System functionality in
the service-based architecture is achieved with network functions that provide
service access to each other. Network functions are components of the network
infrastructure with specific functional behavior, such as mobility and session
handling. Network functions are made up of network function services [18,
pp. 79-82]. The AMF and SMF are discussed in separate sections because they
are the focus of this thesis.
Interactions between network functions are done in two ways. One is
reference point representation, and the other is service-based representation.
Reference point representation uses a defined point-to-point reference to
represent an interface between two network functions. In service-based
representation, network functions in the control plane act as service providers
by providing services through their service-based interfaces to other network
functions. When these services are used by other network functions, those
network functions act as service consumers.
Namf The AMF exposes and provides services to other network functions, such as
the SMF and other AMFs.
N2 Between the AMF and the RAN connected to the UE. NGAP messages
are transferred between the AMF and the RAN.
2.2.2 SMF
The SMF sets up and manages sessions according to the network policy.
Session management procedures establish and handle connections between
the UE and one or more data networks. Through SMF services, consumer
network functions control PDU session events. The SMF selects and controls
the UPF, and also manages traffic steering at the UPF to route traffic toward
the final destination. It also supports charging for PDU sessions [18, p. 81].
SMF supports session management procedures for 5GS, EPS, and
untrusted Non-3GPP networks. For PDU sessions, it supports 5GS and PDN
connections in the EPS.
Nsmf The SMF offers the PDU session service, which allows other NFs, such as
the AMF, to establish, modify, and release PDU sessions over the Nsmf SBI.
N4 Through the N4, the SMF controls the UPF by creating, updating, and
removing the N4 session context in the UPF.
2.2.3 UPF
UPF handles the user plane path of PDU sessions and supports packet routing,
inspection, and traffic reporting [18, pp. 84-85]. For this thesis, its function
of processing and forwarding the user packets, and the ability to control and
optimize data traffic are important.
As Figure 2.5 shows, the UPF supports several important interfaces:
NSSF Supports the selection of network slices and AMF sets for the UE.
Research Methodology
With this basic understanding, the latency for AMF can be illustrated as
shown in Figure 3.1.
For the AMF in the graph, there are two main segments of latency:
3.1.1.1 Latency 1
This segment runs from the AMF receiving the "PDU Session
Establishment Request" forwarded by the RAN until the AMF sends the
"Nsmf_PDUSession_CreateSMContext Request" to the SMF, which marks the
beginning of the SMF section.
During this process, the AMF goes through the following steps:
1. Validate the integrity and accuracy of the received message. This
includes the RAN passing a series of UE information, such as UE
Radio Capability and UE Radio Capability for Paging, as well as initial
scenario settings, among others. This not only facilitates resource
allocation behavior by the AMF but also allows the AMF to check the
basic information of the UE.
2. Allocate resources for the upcoming session. This includes setting the
Routing Indicator, which contains routing information for forwarding the
session to the appropriate destination; specifying the data network to
which the PDU session belongs; selecting the corresponding network slice
(S-NSSAI); assigning a unique ID to the session; specifying the request
type; and assigning the corresponding SMF, DNN, and UE-related
information such as the SUPI (IMSI included), among others.
3. Communicate with the selected SMF to request the creation of the
corresponding SMContext to support the management and control of
this session.
3.1.1.2 Latency 2
This segment includes the process from the AMF receiving the
"Namf_Communication_N1N2_MessageTransfer Request" from the SMF
to sending the "Namf_Communication_N1N2_MessageTransfer Response"
to the SMF and sending the "PDU Session Resource Setup Request" to the
RAN.
During this process, the AMF goes through the following steps:
1. Sending a response to the SMF: The AMF parses the received request
message and generates the corresponding response message. The
request from the SMF includes an N1 PDU Session Establishment
Accept and an N2 PDU Session Resource Setup Request Transfer
message. By sending this response message to the SMF, the AMF
notifies the SMF of the reception and processing status of the message.
2. Sending a request to the RAN: After receiving the request from the SMF,
the AMF needs to communicate with the RAN to establish the resources
for the PDU Session. The AMF sends an N2 PDU Session Resource
Setup Request message to the RAN, including the N2 SM information
and NAS message. This request is used to set up an Access Network
resource.
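The two AMF segments described above can be computed from message timestamps along the following lines. This is a minimal sketch, assuming the capture has already been reduced to one timestamp per message name; the function name, the dictionary layout, the sample timestamps, and the choice to end the second segment at the later of the two outgoing messages are assumptions made for illustration.

```python
def amf_latency_segments(ts):
    """Compute the two AMF latency segments from message timestamps (seconds).

    `ts` maps a message name to the time it was observed at the AMF.
    """
    latency_1 = (ts["Nsmf_PDUSession_CreateSMContext Request"]
                 - ts["PDU Session Establishment Request"])
    # Assumption: segment 2 ends when the later of the two outgoing messages leaves.
    segment_2_end = max(ts["Namf_Communication_N1N2_MessageTransfer Response"],
                        ts["PDU Session Resource Setup Request"])
    latency_2 = segment_2_end - ts["Namf_Communication_N1N2_MessageTransfer Request"]
    return latency_1, latency_2, latency_1 + latency_2

# Hypothetical timestamps, for illustration only.
example = {
    "PDU Session Establishment Request": 0.000,
    "Nsmf_PDUSession_CreateSMContext Request": 0.004,
    "Namf_Communication_N1N2_MessageTransfer Request": 0.050,
    "Namf_Communication_N1N2_MessageTransfer Response": 0.052,
    "PDU Session Resource Setup Request": 0.053,
}
l1, l2, total = amf_latency_segments(example)
```

The time between the two segments, during which the SMF and UPF do their part, is deliberately excluded, so the total reflects only the AMF's own contribution.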
3.1.2.1 Latency 1
This segment covers the period from when the AMF sends the
"Nsmf_PDUSession_CreateSMContext Request" to the SMF until the
SMF sends the "Nudm_SubscriberDataManagement_Get Request" to the
UDM.
For the SMF, this process involves the following steps:
3. Sending a GET request to the UDM: The SMF forwards the UE’s SUPI,
S-NSSAI, and DNN parameters to the UDM. The SMF aims to retrieve
subscriber subscription data and configuration information from the
UDM to support subsequent session control and service provisioning.
3.1.2.2 Latency 2
This segment includes the period from when the UDM sends the
"Nudm_SubscriberDataManagement_Get Response" to the SMF until the
SMF sends the "Nudm_SubscriberDataManagement_Subscribe Request"
back to the UDM.
For the SMF, this process involves the following steps:
1. Receiving user data and relevant information from the UDM: The SMF
may perform updates and configurations to reflect the subscriber’s latest
status in the session.
3.1.2.3 Latency 3
This segment includes the period from the UDM’s response
"Nudm_SubscriberDataManagement_Subscribe Response" to the SMF
until the SMF sends the "Nsmf_PDUSession_CreateSMContext Response"
to the AMF and the "PFCP Session Establishment Request" to the UPF.
For the SMF, this process involves the following steps:
1. The SMF becomes aware that there has been a subscription update at
the UDM.
3. The SMF queries the NRF again, this time for UPF selection
based on local configuration. The request is made to discover
Npcf_SMPolicyControl service instances. Upon the NRF’s response,
the SMF, in coordination with the PCF, selects a QoS flow based on
the required default policies, charging rules, and Quality of Service
3.1.2.4 Latency 4
This segment spans from when the UPF sends the "PFCP Session
Establishment Response" to the SMF until the SMF sends the
"Namf_Communication_N1N2_MessageTransfer Request" to the AMF.
For the SMF, this process involves the following steps:
1. The SMF receives the response from the UPF, confirming the
establishment of relevant tunnels and session.
2. The SMF requests the RADIUS server to start the accounting session
and receives a corresponding response.
3. The SMF also queries the NRF and obtains a list of available
Nchf_ConvergedCharging service instances for charging management.
The SMF establishes communication with the CHF and starts billing
data.
3.1.3.1 Latency
This segment spans from when the SMF sends the "PFCP Session
Establishment Request" to the UPF until the UPF sends the "PFCP Session
Establishment Response" to the SMF.
For the UPF, this process involves the following steps:
1. Upon receiving the request from the SMF, the UPF allocates the
necessary resources for the session based on the provided QoS and
routing information. This ensures the availability of resources and
facilitates the establishment of tunnels for the session.
2. The UPF establishes the user plane path for the PDU session based on
the given information. This involves setting up a unidirectional tunnel
from the gNodeB to the UPF to facilitate the flow of user data.
3. The UPF updates the tunnel information to the SMF, including its IP
address and other relevant parameters. This information is crucial for the
SMF to identify and manage the session, as well as to facilitate further
control and communication with the UPF.
3.2.2.2 NFs
Each network function has logs to record the UE activities generated by the
Traffic Generator, usually in XML format. When a portion of the users
becomes active on the Traffic Generator side, these users can be considered
as a nearly fixed load appearing in each network function. By changing the
number of active users, the load on each network function can be controlled.
However, it is not feasible to record logs for all active users in the background
as it would impose a heavy load on the server and generate a lot of unnecessary
data. Therefore, a specific UE can be registered in each network function,
which allows displaying only the information of that UE across the network
functions. Additionally, the actual number of background users is much larger
than that specific UE, and multiple sets of servers in a network function are
used to share the load of such a large number of users. Hence, the impact of
this specific UE on the overall CPU can be considered negligible.
Once this UE is registered on each network function, tests can be
performed on the Traffic Generator side, with the ability to change the number
of test iterations and the test content. Enabling the monitoring functionality
on each network function allows obtaining the corresponding records.
• Message ID
• Timestamp
• Message protocol
As for message types, 3GPP has defined the contents as shown in Figure
3.5. All message types for AMF and SMF discussed in this document are
listed in Table 3.1 and Table 3.2.
instantaneous log, but formal logs can span a longer duration, resulting in more
key-value pairs in the "value" section for better record-keeping. The useful
information for us here includes:
• Value section contains a pair of values: the test time and the utilization
rate on this server.
element, and the file formats are also different. The overall execution flow
is illustrated in Figure 3.7.
information. Afterward, there are some editable filters kept in CSV files. The
filters are based on strings, message types, protocol types, key-value types,
etc., from the relevant information for filtering. Irrelevant or error-related
packets are filtered out. The useful information is stored in a file like PDU-
AMF_10_0520_262800000499999.csv, while the irrelevant or error-related
packets are stored in PDU-AMF_10_0520_262800000499999_Nonset.csv.
The stored information includes the following:
• Source destination
• Target destination
• Protocol
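The filtering step described above, which separates useful packets from irrelevant or error-related ones, might look roughly like the sketch below. The column names, the in-memory CSV text, the "Heartbeat" row, and the function name split_packets are hypothetical; the actual filters and file layout used in this thesis are not reproduced here.

```python
import csv
import io

def split_packets(rows, allowed_protocols, allowed_message_types):
    """Separate packet records into kept rows and rejected ("Nonset") rows."""
    kept, rejected = [], []
    for row in rows:
        ok = (row["protocol"] in allowed_protocols
              and row["message_type"] in allowed_message_types)
        (kept if ok else rejected).append(row)
    return kept, rejected

# Hypothetical export; the column names are assumptions for this sketch.
raw = io.StringIO(
    "source,target,protocol,message_type\n"
    "RAN,AMF,NGAP,PDU Session Establishment Request\n"
    "AMF,SMF,HTTP2,Nsmf_PDUSession_CreateSMContext Request\n"
    "AMF,NRF,HTTP2,Heartbeat\n"
)
rows = list(csv.DictReader(raw))
kept, rejected = split_packets(
    rows,
    allowed_protocols={"NGAP", "HTTP2"},
    allowed_message_types={
        "PDU Session Establishment Request",
        "Nsmf_PDUSession_CreateSMContext Request",
    },
)
```

In a real run, kept and rejected would each be written to their own CSV file, mirroring the pair of file names mentioned above.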
In this section, the focus is on processing the JSON files. First, the
corresponding file is read; its layout is similar to that shown in Figure
3.6. The "metric" corresponds to a dictionary containing various metrics,
and the "value" is read as a list containing a timestamp and a value. Next,
the dictionary is iterated through to find the key "workload_name". If the
corresponding element is found, the corresponding "value" is stored in a
new list. If it is not found, the program exits. After the iteration, there are
usually 8 groups of servers for AMF and 5 groups for SMF. They have the
same timestamps and slightly different CPU utilization values, but due to
load-balancing strategies, the CPU utilization is generally similar within each
group. Therefore, the average CPU utilization is calculated, and the previous
timestamps are converted to relative time for ease of merging with the latency
file to calculate CPU utilization at each moment. This list is then stored in a
file with a name like: AMF_5.log.
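The JSON processing described above can be sketched as follows. The keys "metric", "value", and "workload_name" come from the description; everything else is an assumption for illustration, including the flat list of entries, the string-encoded utilization values, the function name average_cpu, and raising an error where the text says the program exits.

```python
import json

def average_cpu(json_text, workload):
    """Average CPU utilization across server groups, keyed by relative time."""
    entries = json.loads(json_text)
    per_time = {}
    for entry in entries:
        # Keep only series whose metric labels name the requested workload.
        if entry["metric"].get("workload_name") != workload:
            continue
        timestamp, value = entry["value"]
        per_time.setdefault(timestamp, []).append(float(value))
    if not per_time:
        # Assumption: signal the missing workload with an exception.
        raise ValueError(f"no series found for workload {workload!r}")
    start = min(per_time)
    # Convert absolute timestamps to time relative to the first sample.
    return [(t - start, sum(v) / len(v)) for t, v in sorted(per_time.items())]

# Hypothetical input with two AMF server groups and one SMF group.
sample = json.dumps([
    {"metric": {"workload_name": "amf"}, "value": [1000, "40.0"]},
    {"metric": {"workload_name": "amf"}, "value": [1000, "44.0"]},
    {"metric": {"workload_name": "amf"}, "value": [1005, "50.0"]},
    {"metric": {"workload_name": "smf"}, "value": [1000, "10.0"]},
])
series = average_cpu(sample, "amf")
```

The resulting list of (relative time, average utilization) pairs is the form that can be merged with the latency file.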
out the appropriate service content, protocol, and specific message IDs. There
may still be missing files in this CSV. Therefore, the purpose of this step is to
filter out the missing files and create a separate statistics file for them.
The process is shown in Figure 3.11.
The entire file is read and stored as a dictionary. First, filtering is performed
based on message type, message header number, and protocol type. For
2 and order 1 for AMF, and between order 7 and order 4 for the second
interval, are marked as separate columns. The total latency is calculated as
well. Additionally, basic statistics of the samples are calculated, including the
number of valid samples, maximum value, minimum value, and average value.
Especially for SMF, where TCP delayed ACK exists, calculating the percentile
values of these data can also reflect their characteristics. These statistical
values are included in the second page of the results, "Statistic result."
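The basic statistics and percentiles mentioned above can be computed with NumPy as in the sketch below. The function name, the chosen percentiles, and the sample values are illustrative assumptions; the single large value merely imitates the kind of outlier that a delayed acknowledgment could produce.

```python
import numpy as np

def latency_statistics(samples, percentiles=(50, 75, 95)):
    """Return count, max, min, mean, and selected percentiles of the samples."""
    data = np.asarray(samples, dtype=float)
    stats = {
        "count": int(data.size),
        "max": float(data.max()),
        "min": float(data.min()),
        "mean": float(data.mean()),
    }
    for p in percentiles:
        stats[f"p{p}"] = float(np.percentile(data, p))
    return stats

# Hypothetical latencies in ms; one outlier dominates the maximum and the mean.
stats = latency_statistics([4.0, 5.0, 6.0, 7.0, 48.0])
```

In this example the median stays close to the typical values while the mean is pulled upward by the outlier, which is why percentiles describe such data better than the average alone.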
Similarly, at the beginning, another dictionary is used to read the invalid
samples. The basic failure type in the invalid samples is HTTP fault header.
Therefore, a list of HTTP fault headers is prepared and iterated over. If an
HTTP fault header is found, it is counted and stored in a separate dictionary.
If not found, the process exits. In the end, the error samples are included in
the "Fault Sample" page, and the missing samples are included in the "Missing
Sample" page, both of which are part of the "Miss and Fault statistic" page.
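Counting fault headers in the invalid samples, as described above, could be sketched like this. The header strings, the representation of a sample as a plain string, and the function name count_faults are hypothetical placeholders rather than the actual fault list used in this thesis.

```python
from collections import Counter

def count_faults(invalid_samples, fault_headers):
    """Count how many invalid samples contain each known HTTP fault header."""
    counts = Counter()
    for sample in invalid_samples:
        for header in fault_headers:
            if header in sample:
                counts[header] += 1
    return dict(counts)

# Hypothetical header strings and samples, for illustration only.
faults = count_faults(
    ["HEADERS :status: 503", "HEADERS :status: 504", "HEADERS :status: 503"],
    [":status: 503", ":status: 504"],
)
```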
Results and analysis
Overall, the data suggests that as the number of IMSIs increases, there is
a gradual increase in CPU load and response times, particularly in the case of
maximum response times. The variation in average and percentile response
times is relatively minimal.
Figure 4.2: Fitting latency and corresponding CPU load for AMF.
Figure 4.2 illustrates the latency-load curve of the AMF, which exhibits a
U-shaped pattern with the left side displaying lower latency compared to
the right side. This pattern can be attributed to two main factors. Initially,
the gradually decreasing part with lower latency is due to the slow-start
mechanism implemented at the beginning of network communication. As the
sender gradually increases the data transmission rate, the network load remains
relatively low, leading to lower latency. However, as the load increases, the
latency rises due to the increased demand for network resources. Therefore,
polynomial fitting up to quadratic order is suitable for analyzing
the data.
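For a U-shaped quadratic fit, the load level with the lowest fitted latency is the vertex of the parabola, at -b / (2a) for a fit a*x^2 + b*x + c with a > 0. The following sketch illustrates this; the function name, the clamping to the observed load range, and the sample data are assumptions and do not reproduce the fits in this chapter.

```python
import numpy as np

def optimal_load(cpu_load, latency):
    """Return (load, latency) at the minimum of a quadratic fit."""
    a, b, c = np.polyfit(cpu_load, latency, 2)
    if a <= 0:
        raise ValueError("fit is not U-shaped (leading coefficient <= 0)")
    x_min = -b / (2 * a)
    # Clamp to the observed range so the estimate is not an extrapolation.
    x_min = float(np.clip(x_min, np.min(cpu_load), np.max(cpu_load)))
    return x_min, float(np.polyval([a, b, c], x_min))

# Hypothetical points lying on 0.01 * (x - 40)^2 + 3 (load in %, latency in ms).
load = np.array([10.0, 30.0, 50.0, 70.0, 90.0])
lat = 0.01 * (load - 40.0) ** 2 + 3.0
best_load, best_latency = optimal_load(load, lat)
```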
We can simplify the subsequent analysis into four parts: the slow start
phase, the peak efficiency phase, the load escalation phase, and the overload
phase.
It can be observed that the curve for the P50 percentile in Figure 4.3 is not
particularly distinct because of the limited amount of retained data. The
overall trend is still visible, but it is not prominent. In this fit, the
maximum is a small latency at the highest CPU load, while the minimum is only
slightly lower than that maximum and occurs at a medium CPU load.
The P75 fit in Figure 4.4 already reflects the overall trend clearly.
In this fit, the maximum nearly reaches medium latency at the
highest CPU load, while the minimum is a small latency at a CPU load slightly
below the medium load at which the P50 minimum occurs.
The P95 fit in Figure 4.6 is the most representative, as it reflects the
significant impact of slow start and high load on the AMF, both of which
increase latency. In this fit, the maximum is a medium latency at the highest
CPU load, while the minimum is a low latency at a moderate CPU load.
In summary, the maximum of each AMF fit always occurs at the highest CPU
load, while the minimum occurs around a moderate to medium load level.
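Given a fitted quadratic, the positions of the minimum and maximum on the measured load range follow from the vertex and the two endpoints. The sketch below illustrates this under assumed coefficients; the function name `extremes_on_interval` and all numeric values are hypothetical and are not taken from the fits in this thesis.

```python
def extremes_on_interval(a, b, c, lo, hi):
    """Return ((x_min, y_min), (x_max, y_max)) of a*x^2 + b*x + c on [lo, hi]."""
    def f(x):
        return a * x * x + b * x + c

    # The extremes of a quadratic on a closed interval lie at the endpoints
    # or at the vertex, if the vertex falls inside the interval.
    candidates = [lo, hi]
    if a != 0:
        vertex = -b / (2 * a)
        if lo < vertex < hi:
            candidates.append(vertex)

    points = [(x, f(x)) for x in candidates]
    lowest = min(points, key=lambda p: p[1])
    highest = max(points, key=lambda p: p[1])
    return lowest, highest


# Hypothetical coefficients (CPU load in percent, latency in ms).
(x_min, y_min), (x_max, y_max) = extremes_on_interval(0.002, -0.16, 6.0, 10, 90)
print(f"minimum {y_min:.2f} ms at {x_min:.0f}% load")
print(f"maximum {y_max:.2f} ms at {x_max:.0f}% load")
```

With an upward-opening parabola whose vertex lies left of the middle of the range, the maximum falls at the highest load and the minimum at a moderate load, which is the behavior summarized above for the AMF.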
• Message retransmission:
As shown in Figure 4.12, the same message is sent twice from the AMF to
the SMF, with the second message sent one second after the first.
However, only the second message receives a reply. This can also be
attributed to packet loss or delay.
• DL NAS transport (Payload was not forwarded): This indicates that the
AMF is experiencing congestion and high load, resulting in the payload
not being correctly processed and forwarded.
IMSIs num.    CPU load level    Actual sessions    Loss sessions    Loss rate (%)
0–3 × 10^5    Low/Moderate      /                  /                /
3.5 × 10^5    Medium            958                7                0.73
3.7 × 10^5    Medium            510                9                1.76
4 × 10^5      Medium to High    1161               104              8.96
4.5 × 10^5    Medium to High    1404               343              24.43
5 × 10^5      High              418                154              36.84
5.5 × 10^5    High              638                224              35.11
6 × 10^5      High              342                140              40.94
6.5 × 10^5    High              134                119              88.81
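The loss rates in the table above are consistent with dividing the number of lost sessions by the number of actual sessions. The following sketch reproduces that arithmetic; the function name `loss_rate_percent` is an assumption introduced for illustration, and the session counts are copied from the table.

```python
def loss_rate_percent(actual_sessions, lost_sessions):
    """Loss rate in percent, rounded to two decimals as in the table."""
    return round(100.0 * lost_sessions / actual_sessions, 2)


# (IMSIs, actual sessions, lost sessions) taken from the table above.
rows = [
    (3.5e5, 958, 7),
    (3.7e5, 510, 9),
    (4.0e5, 1161, 104),
    (4.5e5, 1404, 343),
    (5.0e5, 418, 154),
    (5.5e5, 638, 224),
    (6.0e5, 342, 140),
    (6.5e5, 134, 119),
]

for imsis, actual, lost in rows:
    print(f"{imsis:.1e} IMSIs: {loss_rate_percent(actual, lost):.2f}% lost")
```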
• The response time of the SMF does not fluctuate significantly in most
cases and generally falls within the medium latency range.
• When the number of IMSIs reaches 450,000 and above, the response
times begin to rise significantly, indicating that the load pressure on the
relevant pods of the SMF also increases significantly.
• The maximum delay of the SMF often differs significantly from the
average and the minimum, which is caused by the TCP Delayed
Acknowledgment strategy. However, the mitigated average response
time under low load is also due to this strategy.
Figure 4.16: Fitting latency and corresponding CPU load for SMF.
As in Figure 4.17, the scatter plot results in P50 are relatively balanced.
While there may be minor variations in the fitted curve, the overall latency
As shown in Figure 4.19, the data for P90 reflects the relationship between
latency and load. However, the scatter plot does not include packets that
are significantly affected by TCP delayed acknowledgments. In this fit, the
maximum is a large latency at the highest CPU load, while the minimum is a
small latency at a CPU load that is much higher than for P75 but still low.
The scatter plot for P95 in Figure 4.20 includes many samples that are
affected by TCP delayed acknowledgments. At low load, the P50 and P75 curves
are largely unaffected, whereas the P95 curve is affected by TCP delayed
acknowledgments, probably until the load reaches a moderate level. These
samples are present in a certain proportion and are widely distributed across
various CPU loads. As a result, the fitting curve for P95 is noticeably
higher than the other fitting curves. In this fit, the maximum is a large
latency at the highest CPU load, while the minimum is a small latency at a
CPU load that is slightly higher than for P90 but still low.
In summary, the maximum of each SMF fit always occurs around the highest CPU
load we set, while the minimum varies between low and moderate load levels.
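The difference between the percentile curves can be illustrated with a small nearest-rank percentile calculation. This is a sketch under stated assumptions: the sample values are invented, the 40 ms penalty stands in for a delayed-acknowledgment timer and is not a value measured in this thesis, and the function name `percentile` is hypothetical.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in ms: 92 normal responses and 8 responses that
# waited for an assumed 40 ms delayed-acknowledgment timer.
samples = [2.0] * 92 + [42.0] * 8

for p in (50, 75, 90, 95):
    print(f"P{p}: {percentile(samples, p)} ms")
```

In this example only 8% of the samples are delayed, so P50, P75 and P90 stay at the normal latency while P95 jumps to the delayed value. This mirrors the observation that delayed acknowledgments mainly lift the P95 curve.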
delayed acknowledgments. The most notable change is that the dashed line
representing P90 gradually shifts from the back to the front
IMSIs num.      CPU load level    Actual sessions    Loss sessions    Loss rate (%)
0–3.5 × 10^5    Low to Medium     /                  /                /
3.7 × 10^5      Medium            510                30               5.88
4 × 10^5        Medium            1054               205              19.45
4.5 × 10^5      Medium to High    1395               526              37.71
5 × 10^5        Medium to High    338                132              39.05
5.5 × 10^5      Medium to High    552                215              38.95
6 × 10^5        Medium to High    270                122              45.19
6.5 × 10^5      Medium to High    45                 26               57.78
Table 4.4: Combined Fault Rate Table. IMSIs are in units of 10^5. AS: Actual
Sessions, F404S: Fault(404) sessions, F404R: Fault(404) rate(%), F503S:
Fault(503) sessions, F503R: Fault(503) rate(%), F504S: Fault(504) sessions,
F504R: Fault(504) rate(%), F500S: Fault(500) sessions, F500R: Fault(500)
rate(%).
"504 Gateway Time-out" indicates that SMF times out while completing a
request, as shown in Figure 4.25. This often occurs when the AMF first
initiates a request to the SMF. The SMF, being under high load, fails to
respond in time and exceeds the AMF's predefined waiting time, resulting
in this error.
"500 Internal Server Error" may indicate insufficient resources in SMF to
fulfill the request processing, as shown in Figure 4.26. As this occurrence
is rare and happens when the SMF is under high load, it is unlikely to be a
software or code issue.
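The per-code fault counts and rates summarized in Table 4.4 can be derived from a list of response status codes, as in the sketch below. The function name `fault_rates`, the sample codes, and the choice of actual sessions as the denominator are assumptions made for illustration; the status phrases come from Python's standard `http.HTTPStatus` enumeration.

```python
from collections import Counter
from http import HTTPStatus


def fault_rates(status_codes):
    """Return {code: (fault_sessions, fault_rate_percent)} for 4xx/5xx responses."""
    total = len(status_codes)
    counts = Counter(code for code in status_codes if code >= 400)
    return {code: (n, round(100.0 * n / total, 2)) for code, n in counts.items()}


# Hypothetical run: 100 sessions, most of them answered successfully.
codes = [201] * 90 + [504] * 6 + [503] * 3 + [500] * 1

for code, (sessions, rate) in sorted(fault_rates(codes).items()):
    print(f"{code} {HTTPStatus(code).phrase}: {sessions} sessions, {rate}%")
```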
Request and Response within the UPF but also the time taken for requests sent
from the SMF and sent back to the SMF. Due to some malfunctions in UPF
trace log functionality, it was not possible to conduct independent testing, so
this test was approached from the perspective of the SMF.
5.1 Conclusion
This thesis aims to measure and analyze the performance of some core
Network Functions (NFs) in a practical 5GC scenario: PDU session
establishment. By understanding the actual performance of these NFs,
particularly in terms of latency and resource utilization, we can determine if
the deployed NFs are functioning properly and assess their performance in
real-world applications. Importantly, by analyzing the latency contribution
of these network functions, we can gain insights into the overall network
operation and make timely adjustments to relevant parameters in order to
provide the best user experience.
To carry out the practical measurements, we gradually increase the number
of active users in the Traffic Generator while observing the performance
changes of various network functions and recording the results. After
documenting these functionalities, we organize the results and generate the
latency and servers’ load performance curves for the AMF and SMF, and
we analyze the variation in latency of the UPF under different levels of user
activity. This information is beneficial for practical testing and application.
Therefore, in terms of overall outcomes, this thesis discusses how to
infer the latency contribution of each network function to the entire network,
achieved through running the same scenario tests under varying loads. If these
network devices can respond to and process user requests in a normal and
stable manner, we can deduce their dependence on network loads, i.e., the
range of loads under different scenarios. Regarding the relationship between
E2E latency and network load, in general, as the load increases, the end-
to-end latency of network functions also increases. Notably, a significant
Conclusion and Future Work | 65
5.2 Limitations
Another issue is that the UPF trace log functionality encountered some errors,
making it difficult to measure the load of the UPF. Additionally, the latency
of the UPF can only be observed through the SMF, so the measurement may
include intermediate transfer time. Nevertheless, it is evident that the
latency of the UPF becomes significantly longer, which poses challenges for
the analysis in this aspect.
Increasing the number of data points across various load levels would
contribute to a more comprehensive analysis.
www.kth.se