
Degree Project in Electrical Engineering, specializing in Communication Systems

Second cycle 30.0 credits

Prediction of 5G system latency contribution for 5GC network functions

ZIYU CHENG

Stockholm, Sweden 2023


Master’s Programme, Communication Systems, 120 credits


Date: October 12, 2023

Supervisors: Sara Klug, Ozan Alp Topal


Examiner: Ben Slimane
School of Electrical Engineering and Computer Science
Host company: Ericsson AB
Swedish title: Förutsägelse av 5G-systemets latensbidrag för
5GC-nätverksfunktioner
© 2023 Ziyu Cheng

Abstract
End-to-end delay measurement is crucial for network models, as it serves as a
pivotal metric of a model's effectiveness, helps delineate its performance
ceiling, and stimulates further refinement and enhancement. This holds true
for 5G Core Network (5GC) models as well. Commercial 5G networks, with their
intricate topologies and stringent latency requirements, need an effective
model for anticipating each server's current latency and load levels.
Consequently, a model for estimating the present latency and load level of
each network element server would be advantageous. The central work of this
thesis is to record and analyze the packet data and CPU load data of network
functions running at different user counts, using the data from each
successful operation of a service as model data for analyzing the relationship
between latency and CPU load. Particular emphasis is placed on the end-to-end
latency of the PDU session establishment scenario across two core functions:
the Access and Mobility Management Function (AMF) and the Session Management
Function (SMF).
Through this methodology, a more accurate model has been developed to
review the latency of servers and nodes serving up to 650,000 end users.
This approach provides new insights for network-level testing, paving the
way for a comprehensive understanding of network performance under various
conditions. These conditions include flow-control strategies such as "slow
start" and "delayed TCP acknowledgment", as well as overload situations where
the load of a network function exceeds 80%. The analysis also identifies the
optimal performance range.

Keywords
Network model, End-to-end delay, 5GC, Prediction model, CPU load

Sammanfattning
End-user latency measurements are considered important for network models,
as they serve as a yardstick for a model's effectiveness, help define its
performance ceiling, and contribute to further refinement and improvement.
This assumption also applies to the 5G core network (5GC). Commercial 5G
networks, with their complex topology and low-latency requirements, need an
effective model to predict each server's current load and latency
contribution.
Consequently, a model is needed that describes the current latency and its
dependence on the load level of each network element. The work consists of
collecting and analyzing packet data and CPU load for network functions in
operation with varying numbers of end users. The focus is on services used
as model data for analyzing the relationship between latency and CPU load.
Particular attention is paid to end-user latency during PDU session
establishment for two core functions: the Access and Mobility Management
Function (AMF) and the Session Management Function (SMF).
Through this methodology, a more accurate model has been developed for
reviewing the latency of servers and nodes serving up to 650,000 end users.
This approach has provided new insights for network-level testing, paving
the way for a comprehensive understanding of network performance under
various conditions. These conditions include flow-control strategies such as
"slow start" and "delayed TCP acknowledgment", as well as overload situations
where the load of the network functions exceeds 80%. It also identifies the
optimal performance range.

Keywords
Network model, End-user latency, 5GC, Prediction model, CPU load
Contents

1 Introduction
1.1 Background
1.2 Problem Statement
1.3 Goals
1.4 Research Methodology
1.5 Contributions
1.6 Structure of the Thesis

2 Background
2.1 5GS
2.1.1 UE
2.1.2 RAN
2.1.3 5GC
2.2 Network function in 5GC
2.2.1 AMF
2.2.1.1 RM-REGISTERED: Registered Status
2.2.1.2 RM-DEREGISTERED: Deregistered Status
2.2.2 SMF
2.2.3 UPF
2.2.4 Other network functions
2.3 Scenario select
2.3.1 Service select
2.3.2 PDU Session Establishment
2.4 General Tools
2.4.1 Wireshark
2.4.2 PAC Manager

3 Research Methodology
3.1 Component of PDU Session Establishment
3.1.1 Contributions in AMF
3.1.1.1 Latency 1
3.1.1.2 Latency 2
3.1.2 Contributions in SMF
3.1.2.1 Latency 1
3.1.2.2 Latency 2
3.1.2.3 Latency 3
3.1.2.4 Latency 4
3.1.3 Contribution in UPF
3.1.3.1 Latency
3.2 Collection method
3.2.1 System environment descriptions
3.2.2 System working mechanisms
3.2.2.1 Traffic Generator
3.2.2.2 NFs
3.2.3 Collecting process
3.3 Measurement infrastructure
3.3.1 Example message
3.3.2 Example CPU log
3.3.3 Collect and Preprocess Message
3.3.3.1 Latency collecting and preprocessing
3.3.3.2 Recording the CPU utilization and preprocessing
3.4 Statistic and modeling
3.4.1 Statistical Analysis
3.4.1.1 Latency's statistic process
3.4.1.2 CPU utilization's statistic process
3.4.2 Model construction

4 Results and analysis
4.1 Results in AMF
4.1.1 General Results
4.1.1.1 Fitting Latency-Corresponding CPU Load Result
4.1.1.2 Fitting Latency-Corresponding CPU Load Distributions
4.1.1.2.1 The P50 Latency-CPU load distributions analysis
4.1.1.2.2 The P75 Latency-CPU load distributions analysis
4.1.1.2.3 The P90 Latency-CPU load distributions analysis
4.1.1.2.4 The P95 Latency-CPU load distributions analysis
4.1.2 Latency Analysis
4.1.2.1 Slow Start Phase
4.1.2.2 Peak Efficiency Phase
4.1.2.3 Load Escalation Phase
4.1.2.4 Overload Phase
4.1.3 Overload Mode in AMF
4.1.4 Fault and Loss Analysis
4.2 Results in SMF
4.2.1 General Results
4.2.1.1 Fitting Latency-Corresponding CPU Load Result
4.2.1.2 Fitting Latency-Corresponding CPU Load Distributions
4.2.1.2.1 The P50 Latency-CPU load distributions analysis
4.2.1.2.2 The P75 Latency-CPU load distributions analysis
4.2.1.2.3 The P90 Latency-CPU load distributions analysis
4.2.1.2.4 The P95 Latency-CPU load distributions analysis
4.2.2 Latency Analysis
4.2.2.1 Beginning Phase
4.2.2.2 Peak Efficiency Phase
4.2.2.3 Load Escalation Phase
4.2.3 Fault and Loss Analysis
4.3 Results in UPF
4.3.1 General Results
4.3.2 Latency Analysis

5 Conclusion and Future Work
5.1 Conclusion
5.2 Limitations
5.3 Future work

References
List of Figures

2.1 5G system architecture.
2.2 5G Core Architecture.
2.3 AMF.
2.4 SMF.
2.5 UPF.
2.6 PDU Session Establishment.

3.1 AMF latency contributions.
3.2 SMF Latency Contributions.
3.3 UPF Latency Contribution.
3.4 Sample XML log.
3.5 Example of NF services.
3.6 Sample CPU log.
3.7 Collect and preprocess message.
3.8 Latency's collecting and preprocess.
3.9 CPU utilization's collecting and preprocess.
3.10 Statistic process.
3.11 Latency's statistic process.
3.12 Latency's statistic process example.
3.13 CPU utilization's statistic process.

4.1 Pods in AMF.
4.2 Fitting latency and corresponding CPU load for AMF.
4.3 Scatter and Fitting in P50 for AMF.
4.4 Scatter and Fitting in P75 for AMF.
4.5 Scatter and Fitting in P90 for AMF.
4.6 Scatter and Fitting in P95 for AMF.
4.7 Slow Start Phase.
4.8 Peak Efficiency Phase.
4.9 Load Escalation Phase.
4.10 Deregistration from UE in Overload Phase.
4.11 Overload Phase.
4.12 Message retransmission.
4.13 404 Not Found example in AMF.
4.14 CPU load around 650,000 IMSIs.
4.15 An example PDU session establishment test in SMF around 650,000 IMSIs.
4.16 Fitting latency and corresponding CPU load for SMF.
4.17 Scatter and Fitting in P50 for SMF.
4.18 Scatter and Fitting in P75 for SMF.
4.19 Scatter and Fitting in P90 for SMF.
4.20 Scatter and Fitting in P95 for SMF.
4.21 Normal Phase.
4.22 Peak Efficiency Phase.
4.23 High Load Phase.
4.24 404 Example in SMF.
4.25 503 and 504 Example in SMF.
4.26 500 Example in SMF.
4.27 Around 100,000 Users in UPF.
4.28 Around 200,000 Users in UPF.
4.29 Around 400,000 Users in UPF.
List of Tables

3.1 AMF interfaces, targets, and protocols.
3.2 SMF interfaces, targets, and protocols.
3.3 UPF interfaces, targets, and protocols.

4.1 Loss rate in AMF.
4.2 Fault rate in AMF.
4.3 Loss rate in SMF.
4.4 Combined Fault Rate Table. IMSIs are in units of 10^5.
AS: Actual Sessions, F404S: Fault(404) sessions, F404R: Fault(404) rate (%),
F503S: Fault(503) sessions, F503R: Fault(503) rate (%), F504S: Fault(504)
sessions, F504R: Fault(504) rate (%), F500S: Fault(500) sessions, F500R:
Fault(500) rate (%).


List of acronyms and abbreviations


5GC 5G Core Network

5GS 5G System

AMF Access and Mobility Management Function

AUSF Authentication Server Function

CHF Charging Function

CPU Central Processing Unit

CSV Comma Separated Values

E2E End-to-End

eNodeB Evolved Node B

EPC Evolved Packet Core

EPS Evolved Packet System

FQDN Fully Qualified Domain Name

gNodeB Next Generation Node B

HPLMN Home Public Land Mobile Network

HTTP Hypertext Transfer Protocol

IMSI International Mobile Subscriber Identity

ISP Internet Service Provider

JSON JavaScript Object Notation

LTE Long-Term Evolution



NEF Network Exposure Function

NF Network Function

NG-RAN Next Generation Radio Access Network

NGAP NG Application Protocol

NRF Network Repository Function

NSSF Network Slice Selection Function

PCF Policy Control Function

PDU Protocol Data Unit

PFCP Packet Forwarding Control Protocol

PGW Packet Data Network Gateway

QoS Quality of Service

RAN Radio Access Network

SBA Service Based Architecture

SBI Service Based Interface

SMF Session Management Function

SMS Short Message Service

SSH Secure Shell

SVM Support Vector Machine

UDM Unified Data Management

UE User Equipment

UNIX Uniplexed Information and Computing System (originally UNICS)

UPF User Plane Function

VPLMN Visited Public Land Mobile Network


Chapter 1

Introduction

The 5G Core (5GC) is the core of the 5G mobile network and includes many
core functions, such as connectivity and mobility management, authentication
and authorization, user data management, and policy management, among
others [1, 2]. Ericsson has conducted development and extensive integration
and validation activities to deliver the industry’s best-performing 5G network,
including feature optimization. For these products, the most important way to
measure their performance is through end-to-end (E2E) latency measurements
[3]. However, many factors affect latency in 5G networks, and these factors
can be divided into two categories: signaling latency and payload latency.
Signaling latency can be understood as control plane latency, while payload
latency includes the process from User Equipment (UE) to gNodeB to the
User Plane. Although this may seem simple, it is more complex than in
previous networks such as Long-Term Evolution (LTE), and 5G also requires
lower latency.
Ericsson has integrated a test environment with corresponding software to
facilitate E2E testing. This allows a thorough representation of the entire end-
to-end process. Several specified International Mobile Subscriber Identities
(IMSIs) are used to perform different operations, such as registering a network
connection, authenticating, establishing a Protocol Data Unit (PDU) session,
switching locations or updating locations, and performing network slice
selection. Afterward, the records of each node can be obtained by capturing
packets, and it is possible to query the commands executed by each node to
retrace the entire process. Timestamps and Central Processing Unit (CPU)
load rates are also recorded. Thus, it can be verified that measuring latency
and monitoring load factors is feasible.
Therefore, it is necessary to explore the correlation between the network
load and the latency contribution of each 5GC network function. Once
we are able to monitor load levels and latency contributions, it will be
easier to understand which network size can provide the optimal end-user
latency, thereby enhancing the user experience. It’s worth noting that existing
latency analyses often treat 5GC as a part of the operator’s overall network,
but 5GC consists of many different network elements, each making distinct
contributions to the overall network [4].

1.1 Background
In the development of 5G networks, comprehensive standards already serve as
specifications within the industry. While 5G networks continue to be a
dominant discussion point in the industry [5], A. Gupta surveys current hot
topics such as 5G cellular network architectures [6], large-scale
multiple-input multiple-output technologies, and device-to-device
technologies. M. Skulysh et al. improve 5G core gateways in anticipation of
placing some of their services in the cloud [7]. Y. Choi focuses on Network
Functions (NFs) in core networks [8], working on efficient data-path
configuration when UEs are on the move.
Since the entire infrastructure is based on NFV, SDN, and network slicing,
it is necessary to understand work related to these technologies. Mijumbi
et al. introduce the current use scenarios of NFV technology, showing that
the combination of NFV and cloud technology is a trend that makes it easier
for companies to deploy these NFs flexibly [9]. Michel et al. systematically
introduce how SDN and NFV technologies are combined into a new paradigm and
propose a new integrated architecture [10]. Y. Wu et al. introduce common
network-slicing scenarios for the industrial IoT and analyze the
architecture, which helps in understanding the integration of an NF across
multiple network slices in this thesis [11]. A. Khatouni et al. used three
supervised learning algorithms (Logistic Regression, Support Vector
Machines, and Decision Trees) for classification tasks to predict large
amounts of real delay data [12], but were limited to a 4G network model.
In addition, Z. Hou proposed an algorithm for optimizing
communication-quality prediction for ultra-reliable low-latency
communications [13]. For CPU-occupancy prediction, M. Duggan et al.
attempted to train recurrent neural networks [14]. Thus, there is still room
for development in combining the key modules of the 5G core network with
latency or CPU considerations.

1.2 Problem Statement


During this work, the question was raised as to whether it is possible to
predict the latency contribution of each 5GC network function individually,
and how these contributions depend on the network load. When there is only
one end user, the latency in the network is higher than at a moderate load
level, but we lack knowledge of the correlation between network load and the
latency contribution of each 5GC network function. Being able to monitor
load levels and latency contributions would allow us to understand the
amount of cloud resources needed to provide the most desirable end-user
latency. Three main questions are therefore to be answered during this
thesis work:

1. Is it possible to predict the latency contribution from each 5GC network
function, and to analyze its dependency on the network load?

2. What is the correlation between 5GS E2E latency and the network load?

3. Is there a load level at which the E2E latency is optimal?

1.3 Goals
The goal is to gather and analyze data from the 5GC network to determine if
a correlation can be made between latency and CPU/Memory utilization for
different end-user use cases and for various levels of background load. The
work also includes developing a method and a series of scripts for extracting
latency data for a set of 5GC network functions.

1.4 Research Methodology


Latency is measured using Wireshark and software developed in-house by
Ericsson. This approach aids in tracing the next hop of signaling and
examining the relationship between NFs. Part of this process involves the
use of bash or Python scripts to automate reading and filtering. Given that
the overall transmission protocols involved are reasonably predictable, and the
information carried by the signaling is essentially fixed, we can apply filters
to select useful packets or those that contribute to error analysis, storing them
separately.
Once stored, Python programs are used to calculate the time delay between
signals, which includes both horizontal delay (delay between NFs) and
vertical delay (delay within NFs). This step involves identifying the
inherent processing delay of a specific NF participating in the PDU session
establishment. Combining these delays gives the total delay of that NF. With
the original data at hand, statistical analysis of the delay becomes feasible.
Concurrently, in situations of high server load for a given network function, the
collection of failed packets and the packets indicating server load protection
activation allows not only an understanding of server operation under high
load conditions but also aids in preventing such situations.
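As a rough sketch of this delay-calculation step, the pure-Python example below pairs each request with its matching response per network function to obtain latency contributions. The packet records and field names ("ts", "nf", "txn", "kind") are illustrative stand-ins, not the actual capture format used in the thesis.

```python
from collections import defaultdict

def nf_latencies(packets):
    """Compute per-NF latency contributions from captured packets.

    Each packet is a dict with 'ts' (seconds), 'nf' (network function name),
    'txn' (a transaction id pairing request and response), and 'kind'
    ('req' or 'rsp').  These fields are hypothetical, chosen only to
    illustrate the pairing logic.
    """
    pending = {}                 # (nf, txn) -> request timestamp
    delays = defaultdict(list)   # nf -> list of delays in seconds
    for p in sorted(packets, key=lambda p: p["ts"]):
        key = (p["nf"], p["txn"])
        if p["kind"] == "req":
            pending[key] = p["ts"]
        elif key in pending:
            delays[p["nf"]].append(p["ts"] - pending.pop(key))
    return dict(delays)

# Example: one AMF transaction of 5 ms and one SMF transaction of 12 ms.
packets = [
    {"ts": 0.000, "nf": "AMF", "txn": 1, "kind": "req"},
    {"ts": 0.005, "nf": "AMF", "txn": 1, "kind": "rsp"},
    {"ts": 0.010, "nf": "SMF", "txn": 2, "kind": "req"},
    {"ts": 0.022, "nf": "SMF", "txn": 2, "kind": "rsp"},
]
print(nf_latencies(packets))
```

The same pairing idea extends to vertical delay by subtracting the timestamps of a message entering and leaving one NF.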
Automatic acquisition of latency and CPU/memory utilization data is
based on historical data from servers across various network functions. The
data is formatted as JSON files, which facilitates analysis using Python,
ultimately producing CPU usage diagrams during latency testing.
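A minimal sketch of this JSON-parsing step is shown below. The real log schema is internal to Ericsson and not public, so the layout assumed here ({"samples": [{"time": ..., "cpu": ...}]}) is purely hypothetical.

```python
import json

def cpu_series(raw_json):
    """Extract a time-ordered (timestamp, load) series from a JSON CPU log.

    Assumes a hypothetical schema:
        {"samples": [{"time": <unix seconds>, "cpu": <percent>}, ...]}
    """
    doc = json.loads(raw_json)
    # Sort by timestamp so the series can be plotted or aligned with latency.
    return sorted((s["time"], s["cpu"]) for s in doc["samples"])

raw = '{"samples": [{"time": 2, "cpu": 41.5}, {"time": 1, "cpu": 37.0}]}'
print(cpu_series(raw))
```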
Equipped with latency and CPU utilization curves, a correlation can be
established, generating corresponding scatter plots and enabling curve
fitting. Polynomial fitting is used to obtain an analytical approximation.
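The polynomial fitting step can be sketched as follows. In practice one would typically call numpy.polyfit; the pure-Python normal-equations version below is a self-contained stand-in, adequate for low-degree fits, and the latency/load numbers are synthetic.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients c such that y ~ c[0] + c[1]*x + ... + c[degree]*x**degree.
    """
    n = degree + 1
    # Normal equations A c = b for the Vandermonde system.
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(A[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (b[r] - s) / A[r][r]
    return coeffs

# Synthetic data following latency = 2 + 0.5*load + 0.01*load**2 exactly,
# so the quadratic fit should recover those coefficients.
loads = [10, 20, 30, 40, 50, 60, 70, 80]
lat = [2 + 0.5 * x + 0.01 * x ** 2 for x in loads]
print([round(c, 4) for c in polyfit(loads, lat, 2)])
```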
Product testing typically occurs at a specific load level, i.e., testing is
performed with a fixed number of users, owing to the broad scope of such
testing. This thesis emphasizes a representative PDU session establishment
case involving several common network functions. By conducting tests with
diverse user counts, the product can be exercised in greater depth, thereby
contributing valuable data to its overall evaluation.

1.5 Contributions
This study introduces a model that examines the relationship and behavior
between load rate (or user count) and network function latency. Firstly,
valuable information from different signals and servers is extracted, with key
data preserved in a CSV format. Secondly, the implementation of automated
analysis for different information effectively reduces the testing workload to
some extent, enabling more efficient analysis of the generated packets from
Wireshark. Finally, data fitting is performed, and representative data points
are analyzed to construct the model. These steps contribute to understanding
which network size can provide the best end-user latency, enhancing the
overall network performance and user experience.

1.6 Structure of the Thesis


The first chapter of this thesis provides an introduction, while the second
chapter explains the core network functions and scenario diagrams involved,
as well as the related software. Chapter 3 describes the research
methodology, i.e., how to collect and preprocess the data, perform feature
extraction, divide the dataset, select and train a suitable model, and
finally evaluate it. Chapter 4 presents the analysis of the results obtained
by following the process in Chapter 3. Chapter 5 summarizes the results.
Chapter 2

Background

2.1 5GS
This section first briefly introduces the 5G system, then moves on to the 5GC
model discussed in this thesis, and finally provides a brief introduction to NFs
other than AMF and SMF.
The 5GS, which consists of UEs, the RAN, and the 5GC, provides system-wide
functions such as Mobile Broadband and Voice Services. The 5GS supports
roaming scenarios, non-roaming scenarios, and interworking with legacy EPC
networks. Figure 2.1 shows its architecture.

Figure 2.1: 5G system architecture.

The user plane and the control plane are separated throughout the 5GS. The
user plane carries user traffic, while the control plane carries signaling
traffic. This separation enables the resources of each plane to be scaled
independently. The 5GC supports parallel access to local and centralized
services; for example, the control plane can be centralized while user-plane
instances serving low-latency services are distributed in local data centers.
Another key trend of 5GS is deploying Network Function Virtualization
(NFV) services and Software Defined Infrastructure (SDI) on the cloud. This
is also implemented in the practical system of this thesis. NFV improves
the flexibility of the network by separating network functions from dedicated
hardware and implementing these functions in virtual machines or containers.
This allows network service providers to deploy and scale services on cloud
infrastructure rapidly. Meanwhile, SDI enables the deployment of network
services in a more dynamic, efficient, and cost-effective manner by abstracting
the hardware layer, making it programmable and capable of being manipulated
as per the application or system needs. With these two technologies as the
fundamental architecture, 5GS is moving towards a service-based structure.
This architecture allows network functions and services to be software-driven,
flexibly deployed, and managed on the cloud, thus achieving efficient operation
of the network and optimizing resource utilization.
This thesis mainly focuses on the 5GC, so the UE and the RAN are introduced
only briefly.

2.1.1 UE
UE refers to a user device, such as a cell phone, a modem, or a laptop, that is
capable of performing various functions, such as making phone calls, surfing
the internet, sending emails, and more. In 5G networks, UE can also support
industrial [15], traffic [16], and medical applications [17].
UE communicates with RAN by sending and receiving data, and it serves
as both the initiator and the terminal for a service. If UE needs to initiate
a request to the core network, it goes through a negotiation process between
RAN and the core network. The basic process can be described as follows:

• The UE establishes a wireless signal connection with the nearest base
station to complete the basic communication of signaling and data
transmission.

• To ensure the security and data protection of the communication process,
the UE performs identity authentication and security negotiation with the
base station, so that encryption and authentication operations can be
performed during subsequent communication.

• The UE performs capability negotiation with the base station to determine
its communication capabilities and the types of services it supports. This
information is important for subsequent communication processes and resource
allocation.

• The UE sends a connection request to the RAN, including the required
service type, bandwidth requirements, and other information, so that the RAN
can schedule and manage it.

• The RAN allocates wireless resources for the UE and manages them through
scheduling to ensure communication quality and resource-utilization
efficiency.

• Finally, the UE connects to the core network of the mobile communication
network to start voice calls, data transmission, and other services.
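The negotiation steps above can be summarized as a simple linear state progression. This is an illustrative abstraction of the list, not a 3GPP-defined state machine, and the state names below are invented:

```python
# Ordered connection-setup steps, abstracted from the list above.
UE_SETUP_STEPS = [
    "RADIO_CONNECTED",       # radio link to the nearest base station
    "AUTHENTICATED",         # identity authentication and security negotiation
    "CAPABILITIES_AGREED",   # capability negotiation with the base station
    "REQUEST_SENT",          # connection request with service/bandwidth needs
    "RESOURCES_ALLOCATED",   # RAN schedules wireless resources
    "CORE_CONNECTED",        # UE reaches the core network, services start
]

def advance(state):
    """Return the next setup step, or None once the UE is core-connected."""
    i = UE_SETUP_STEPS.index(state)
    return UE_SETUP_STEPS[i + 1] if i + 1 < len(UE_SETUP_STEPS) else None

state = UE_SETUP_STEPS[0]
while state is not None:
    last = state
    state = advance(state)
print(last)
```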

2.1.2 RAN
RAN (Radio Access Network) includes base stations (eNodeB in LTE and
gNodeB in 5G), wireless channels, and other facilities. RAN provides wireless
access and data transmission services to UE, manages wireless resources
and security, manages and maintains base station equipment, and optimizes
and improves the network [18, pp. 82-84]. As previously mentioned, UE
interacts with RAN. After UE is connected to the mobile communication
system through RAN, RAN decodes, encodes, and converts signals sent by
UE and transmits them to the core network.
Toward the UE, the RAN provides wireless access and data transmission
services, performs security management and authentication of mobile devices,
and allocates wireless resources. The RAN itself also has functions such as
cell partitioning, power control, and interference management.
The interaction between RAN and the core network involves the following
processes:
• Decoding the information sent by the UE, and then encoding it for error
checking and data recovery.

• Selecting the appropriate protocol based on the information sent by the
UE, so that the core network can process and forward it.

• Routing: the RAN routes signals sent by the UE to the correct core network
node for forwarding and processing. These nodes include the AMF, the SMF,
the UPF, etc.

2.1.3 5GC
5GC is defined as service-based architecture. The system functionality in
the service-based architecture is achieved with network functions that provide
service access to each other. Network functions are components of the network
infrastructure with specific functional behavior, such as mobility and session
handling. Network functions are made up of network function services [18,
pp. 79-82]. The AMF and the SMF are discussed in separate sections, since
they are the focus of this thesis.
Interactions between network functions are done in two ways. One is
reference point representation, and the other is service-based representation.
Reference point representation uses a defined point-to-point reference to
represent an interface between two network functions. In service-based
representation, network functions in the control plane act as service providers
by providing services through their service-based interfaces to other network
functions. When these services are used by other network functions, those
functions are acting as service consumers.
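The difference between the two representations can be illustrated with a toy sketch. Real service-based interfaces are HTTP/2 REST APIs defined by 3GPP, and service discovery is handled by the NRF; the registry, service names, and handlers below are purely conceptual:

```python
# Service-based representation: providers register named services in a
# registry (conceptually what the NRF enables), and any consumer can look
# a service up and invoke it.
registry = {}

def provide(service_name, handler):
    registry[service_name] = handler

def consume(service_name, *args):
    return registry[service_name](*args)

# The AMF provides a (fictional) communication service; the SMF consumes it.
provide("namf-comm", lambda msg: f"AMF handled: {msg}")
print(consume("namf-comm", "N1N2 transfer"))

# Reference-point representation instead hard-wires a point-to-point link
# between two named functions, e.g. N11 between the AMF and the SMF.
reference_points = {("AMF", "SMF"): "N11"}
print(reference_points[("AMF", "SMF")])
```

The contrast is that a service-based consumer needs only the service name, while a reference point ties two specific functions together.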

Figure 2.2: 5G Core Architecture

2.2 Network function in 5GC


2.2.1 AMF
The AMF manages the registration and mobility of UEs in the 5GS. Within
the 5GC, the AMF coordinates signaling between UEs and other network
functions. The AMF also provides service operations for handling N2
point-to-point messages between the RAN and other network functions [18,
pp. 82-83]. There are two registration management states in AMF: RM-
REGISTERED and RM-DEREGISTERED.
As Figure 2.3 shows, the main interfaces are:

N1 An indirect connection between the UE and the AMF, tunneled through the
gNB and over the N2. NAS signaling messages are carried over this
interface.

Namf The AMF exposes and provides services to other network functions, such
as the SMF and other AMFs.

N2 Between the AMF and the RAN connected to the UE. NGAP messages
are transferred between the AMF and the RAN.

Figure 2.3: AMF

2.2.1.1 RM-REGISTERED: Registered Status


A UE enters the RM-REGISTERED state after a successful registration
procedure and can then receive services that require registration in the
network. The registration procedure is performed when a UE needs to perform
initial registration to the 5GS, a mobility registration update when the UE
moves to a new tracking area or updates its capabilities or parameters, or
a periodic registration update.

2.2.1.2 RM-DEREGISTERED: Deregistered Status


A UE in the RM-DEREGISTERED state is not registered. The UE context in
the AMF holds no valid location or routing information for the UE. The AMF
cannot reach a UE in the RM-DEREGISTERED state, as the UE location is
not known. In the RM-DEREGISTERED state, some UE information can be
stored in the UE and AMF. After a successful Deregistration procedure, the
UE registration management state is changed to RM-DEREGISTERED in the
AMF.

2.2.2 SMF
The SMF sets up and manages sessions according to the network policy.
Session management procedures establish and handle connections between
the UE and one or more data networks. Through SMF services, consumer
network functions control PDU session events. The SMF selects and controls
the UPF, and also manages traffic steering at the UPF to route traffic toward
the final destination. It also supports charging for PDU sessions [18,
p. 81]. The SMF supports session management procedures for the 5GS, the
EPS, and untrusted non-3GPP access. It supports PDU sessions in the 5GS and
PDN connections in the EPS.

Nsmf The SMF offers the PDU session service, which allows other NFs, such
as the AMF, to establish, modify, and release PDU sessions over the
Nsmf SBI.

N4 Through the N4, the SMF controls the UPF by creating, updating, and
removing the N4 session context in the UPF.

Figure 2.4: SMF

2.2.3 UPF
UPF handles the user plane path of PDU sessions and supports packet routing,
inspection, and traffic reporting [18, pp. 84-85]. For this thesis, its function
Background | 13

of processing and forwarding the user packets, and the ability to control and
optimize data traffic are important.
As Figure 2.5 shows, the UPF supports the following important interfaces:

N3 Connects the UPF to gNodeBs, allowing the transport of user data.

N4 Connects the UPF to SMFs. The N4 interface is used for session
management and other control signaling.

N6 Connects the UPF to external data networks, allowing the transport of
user data.

Figure 2.5: UPF

2.2.4 Other network functions

As Figure 2.2 shows, the 5GC also contains the following network functions:

AUSF Provides the UE authentication service.

NRF Maintains the profiles of available network function instances.

UDM Supports registration management for the network functions that serve
the UE.

PCF Provides a unified policy framework for session and mobility
management. For sessions, the main responsibility of the PCF is to manage
the quality of service and the interaction with the IP Multimedia Subsystem
for voice calls.

NSSF Supports the selection of network slices and AMF sets for the UE.

NEF Securely exposes network function capabilities and events to third
parties.

CHF Converges online charging and offline charging functionalities.

2.3 Scenario selection

2.3.1 Service selection
With the introduction of various network functions, there are multiple
signaling scenarios involved when establishing and maintaining end-user
service in the 5G system. These different services are an essential part of
the 5G network. They utilize different network functions, protocols, and
ports, making these seemingly simple but actually complex functions
possible. Therefore, for the E2E testing discussed in this thesis, it is
most appropriate to focus on one scenario that represents a central
function of the 5G core and involves its central network functions: PDU
session establishment.
In this process, a PDU session is established between the UE and the network,
allowing the UE to start sending and receiving data. The PDU session includes
multiple PDU session containers, each corresponding to a QoS flow, which
determines the priority and rate of data transmission.
PDU Session Establishment involves network functions such as AMF,
SMF, and UPF, and it involves various protocol layers, especially application
layer protocols (like HTTP, NGAP, etc.). It is also a key indicator of network
performance as it affects the speed at which user equipment begins to send and
receive data. Therefore, selecting this scenario for E2E testing is conducive to
assessing the overall network performance.

2.3.2 PDU Session Establishment


Figure 2.6 shows the PDU session establishment process, which occurs in the
UE-initiated PDU Session Establishment procedure. A PDU is a data transfer
unit used to pass information between network layers. The purpose of this
scenario is to enable the establishment of a connection between the UE and
the Internet Service Providers (ISPs) for data transfer. Once this
connection is in place, other applications and services can be ensured to
work properly.

Figure 2.6: PDU Session Establishment

2.4 General Tools


2.4.1 Wireshark
Wireshark is an open-source tool for monitoring networks and capturing
traces of various protocols. Its powerful and reliable feature set makes it
nearly the default option for capturing packets. Thanks to its versions for
different platforms, it can run on various systems, including macOS,
UNIX/UNIX-like systems, and Microsoft Windows [19].
Wireshark is a versatile packet capture tool with numerous useful features.
It can capture packets in real-time from various interfaces, read multiple
formats including Tcpdump, and filter and retrieve specific packets as needed.
The protocol information for each packet is detailed, and users can save
individual or multiple packet files in various formats, such as CSV or JSON.
Wireshark also has a terminal-oriented counterpart called TShark, whose
usage is the same as Wireshark's [20]. This thesis uses TShark in some of
the scripts, since it is a powerful tool that combines well with the
command line.

In this thesis, we capture packets from network functions (NFs) and analyze
the results using Wireshark, with filters to isolate the packets relevant
to the different scenarios being analyzed. After verifying that the results
meet the desired performance criteria, we store the filter statements and
process multiple PCAP files using scripts that leverage TShark for packet
processing. TShark also exports CSV files with key information, such as the
IP source and destination, marked for easy analysis.
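A TShark invocation of the kind described above can be assembled in a script as follows. This is a sketch: the display filter and field list are illustrative, not the exact filters used in the thesis, although the file name follows the naming pattern shown later.

```python
# Sketch of assembling a tshark command that exports selected packet
# fields as CSV. The filter and field names below are examples only.

def tshark_csv_command(pcap_file: str, display_filter: str,
                       fields: list[str]) -> list[str]:
    """Build a tshark argument list that exports the given fields as CSV."""
    cmd = ["tshark", "-r", pcap_file, "-Y", display_filter,
           "-T", "fields", "-E", "header=y", "-E", "separator=,"]
    for f in fields:
        cmd += ["-e", f]          # one -e flag per exported field
    return cmd

cmd = tshark_csv_command(
    "PDU-AMF_10_0520_262800000499999.pcap",
    "http2 || ngap || nas-5gs",
    ["frame.number", "frame.time_relative", "ip.src", "ip.dst",
     "_ws.col.protocol", "_ws.col.info"],
)
print(" ".join(cmd))
```

The resulting argument list can be passed to `subprocess.run` and its stdout redirected to the per-scenario CSV file.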

2.4.2 PAC Manager


PAC Manager is an open-source SSH/Telnet/Serial connection management tool
developed in Perl with the GTK GUI toolkit [21]. It serves as a free
alternative to similar tools such as PuTTY, SecureCRT, and SSHMenu.
An important feature of PAC Manager is the ability to create SSH tunnels.
In network communication, such as sending and receiving emails, chatting on
instant messengers, or using internet banking, there may be sensitive data
that should not be passed in clear text, or resources that are not directly
accessible due to the network structure and firewalls. SSH port forwarding
addresses this: data that would otherwise be accessed through other TCP
ports is sent through the port occupied by the SSH connection. Because the
data is encrypted and decrypted along the way, this process is often
referred to as "tunneling" [22].
In addition, PAC Manager’s clustering function is useful when managing
multiple remote hosts within a network environment. This function eliminates
the need to execute the same commands on each host, saving time and effort.
In this thesis, PAC Manager is used as a tool with predefined SSH
connections to the network functions (NFs) in the lab. Using PAC is a
convenient way for users to organize their connections and reach the 5G
Core NFs. It also provides access to several jump servers.
Chapter 3

Research Methodology

This chapter provides an overview of the overall methodology. First, it
continues the discussion of the latency components of each network function
in the context of PDU session establishment, as these components form the
end-to-end latency across the network functions. Second, the data
collection methods are introduced, beginning with an overview of the
experimental environment, followed by an example analysis of the recorded
data. Once the environment and examples are established, the data
collection methods are elaborated further. The chapter then covers the
analysis methods: with the collected records, an analysis is performed to
determine the metrics that need to be measured and calculated. The fitting
process follows a similar approach. Finally, potential outcomes of the
prediction are discussed.

3.1 Components of PDU Session Establishment
The overall principle of latency testing is to assess the processing capabilities
of individual network functions in a vertical manner, without considering the
transmission time between elements horizontally. In the context of round-trip
time, the focus of the testing is on the processing delay. Since the delivery of a
service typically involves the collaboration of multiple network functions, the
total processing delay of each element is of concern in this scenario.


3.1.1 Contributions in AMF

The latency of the AMF consists of two components, which involve the
following interfaces and their corresponding connections, targets, and
protocol types, as shown in Table 3.1:
Interface Target Protocol
N1 UE NAS
N2 RAN NGAP
N11 SMF HTTP2

Table 3.1: AMF interfaces, targets, and protocols.

With this basic understanding, the latency for AMF can be illustrated as
shown in Figure 3.1.
For the AMF in the graph, there are two main segments of latency:

3.1.1.1 Latency 1
This segment starts from the AMF receiving the "PDU Session
Establishment Request" forwarded by the RAN to the AMF sending the
"Nsmf_PDUSession_CreateSMContext Request" to the SMF, marking the
beginning of the SMF section.
During this process, the AMF goes through the following steps:
1. Validate the integrity and accuracy of the received message. This
includes the RAN passing a series of UE information, such as UE
Radio Capability and UE Radio Capability for Paging, as well as initial
scenario settings, among others. This not only facilitates resource
allocation behavior by the AMF but also allows the AMF to check the
basic information of the UE.
2. Allocate resources for the upcoming session, such as the Routing
Indicator containing routing information for forwarding the session to
the appropriate destination, specifying the data network to which the
PDU session belongs, selecting the corresponding network slice
(S-NSSAI), assigning a unique ID to the session, specifying the request
type, and assigning the corresponding SMF, DNN, and UE-related
information such as the SUPI (including the IMSI), among others.
3. Communicate with the selected SMF to request the creation of the
corresponding SMContext to support the management and control of
this session.

3.1.1.2 Latency 2
This segment includes the process from the AMF receiving the
"Namf_Communication_N1N2_MessageTransfer Request" from the SMF
to sending the "Namf_Communication_N1N2_MessageTransfer Response"
to the SMF and sending the "PDU Session Resource Setup Request" to the
RAN.
During this process, the AMF goes through the following steps:

1. Sending a response to the SMF: The AMF parses the received request
message and generates the corresponding response message. The
request from the SMF includes an N1 PDU Session Establishment
Accept and an N2 PDU Session Resource Setup Request Transfer
message. By sending this response message to the SMF, the AMF
notifies the SMF of the reception and processing status of the message.

2. Sending a request to the RAN: After receiving the request from the SMF,
the AMF needs to communicate with the RAN to establish the resources
for the PDU Session. The AMF sends an N2 PDU Session Resource
Setup Request message to the RAN, including the N2 SM information
and NAS message. This request is used to set up an Access Network
resource.
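Once the boundary packets of each segment are identified in the trace, the two AMF latency segments reduce to timestamp differences. A minimal sketch follows; the relative timestamps are invented for illustration, while in practice they come from the captured packets.

```python
# Sketch: computing the two AMF latency segments from packet timestamps.
# The event names mirror the messages described above; the times (seconds,
# relative to the first packet) are hypothetical.

def amf_latency(events: dict[str, float]) -> tuple[float, float]:
    latency1 = (events["CreateSMContext Request sent"]
                - events["PDU Session Establishment Request received"])
    latency2 = (events["PDU Session Resource Setup Request sent"]
                - events["N1N2 MessageTransfer Request received"])
    return latency1, latency2

events = {  # hypothetical relative times in seconds
    "PDU Session Establishment Request received": 0.000,
    "CreateSMContext Request sent": 0.004,
    "N1N2 MessageTransfer Request received": 0.030,
    "PDU Session Resource Setup Request sent": 0.033,
}
l1, l2 = amf_latency(events)
print(f"Latency 1: {l1 * 1000:.1f} ms, Latency 2: {l2 * 1000:.1f} ms")
```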

Figure 3.1: AMF latency contributions.



3.1.2 Contributions in SMF

The latency of the SMF consists of four components, which involve the
following interfaces and their corresponding connections, targets, and
protocol types, as shown in Table 3.2:

Interface Target Protocol


N4 UPF PFCP
N10 UDM HTTP2
N11 AMF HTTP2

Table 3.2: SMF interfaces, targets, and protocols.

With this fundamental understanding, the latency for SMF can be illustrated
as shown in Figure 3.2.

3.1.2.1 Latency 1
This segment covers the period from when the AMF sends the
"Nsmf_PDUSession_CreateSMContext Request" to the SMF until the
SMF sends the "Nudm_SubscriberDataManagement_Get Request" to the
UDM.
For the SMF, this process involves the following steps:

1. Parsing the received information: In this stage, the SMF receives
a message from the AMF that contains relevant information for the
session, such as the SUPI and PDU session ID mentioned earlier.

2. Sending a request to the NRF for UDM service instance discovery: The
SMF queries the NRF to discover available
Nudm_SubscriberDataManagement service instances. The NRF provides a
list of discovered instances along with their IP addresses. Upon
receiving the NRF’s response, the SMF can proceed to send requests to
the UDM.

3. Sending a GET request to the UDM: The SMF forwards the UE’s SUPI,
S-NSSAI, and DNN parameters to the UDM. The SMF aims to retrieve
subscriber subscription data and configuration information from the
UDM to support subsequent session control and service provisioning.

3.1.2.2 Latency 2
This segment includes the period from when the UDM sends the
"Nudm_SubscriberDataManagement_Get Response" to the SMF until the
SMF sends the "Nudm_SubscriberDataManagement_Subscribe Request"
back to the UDM.
For the SMF, this process involves the following steps:

1. Receiving user data and relevant information from the UDM: The SMF
may perform updates and configurations to reflect the subscriber’s latest
status in the session.

2. Initiating the subscription session request: The SMF sends a Subscribe
request to the UDM to subscribe to notifications of subscription data
changes. This operation triggers the subscription data change
notification procedure using the HTTP POST method. The request includes
the UE’s SUPI and the subscription data type previously provided by the
UDM.

3.1.2.3 Latency 3
This segment includes the period from the UDM’s response
"Nudm_SubscriberDataManagement_Subscribe Response" to the SMF
until the SMF sends the "Nsmf_PDUSession_CreateSMContext Response"
to the AMF and the "PFCP Session Establishment Request" to the UPF.
For the SMF, this process involves the following steps:

1. The SMF becomes aware that there has been a subscription update at
the UDM.

2. After receiving the subscriber data update and updating the session
management context, the SMF sends a
Nsmf_PDUSession_CreateSMContext Response message to the AMF. This
message notifies the AMF of the creation result of the SMContext and
includes the SMF’s own Fully Qualified Domain Name (FQDN) and IP
address for AMF identification.

3. The SMF queries the NRF again, this time for UPF selection based on
local configuration. The request is made to discover
Npcf_SMPolicyControl service instances. Upon the NRF’s response, the
SMF, in coordination with the PCF, selects a QoS flow based on the
required default policies, charging rules, and Quality of Service (QoS)
parameters. Once selected, the SMF sends a PFCP Session Establishment
Request message to the UPF to allocate the Core Network tunnel.

3.1.2.4 Latency 4
This segment spans from when the UPF sends the "PFCP Session
Establishment Response" to the SMF until the SMF sends the
"Namf_Communication_N1N2_MessageTransfer Request" to the AMF.
For the SMF, this process involves the following steps:

1. The SMF receives the response from the UPF, confirming the
establishment of the relevant tunnels and the session.

2. The SMF requests the RADIUS server to start the accounting session
and receives a corresponding response.

3. The SMF also queries the NRF and obtains a list of available
Nchf_ConvergedCharging service instances for charging management.
The SMF establishes communication with the CHF and starts billing
data.

4. Finally, the SMF sends a Namf_Communication_N1N2_MessageTransfer
Request message to the AMF, including the N1 PDU Session Establishment
Accept and N2 PDU Session Resource Setup Request Transfer messages.

3.1.3 Contribution in UPF

The latency of the UPF consists of a single component, which involves the
following interface and its corresponding connection, target, and protocol
type, as shown in Table 3.3:

Interface Target Protocol


N4 SMF PFCP

Table 3.3: UPF interfaces, targets, and protocols.



Figure 3.2: SMF Latency Contributions

3.1.3.1 Latency
This segment spans from when the SMF sends the "PFCP Session
Establishment Request" to the UPF until the UPF sends the "PFCP Session
Establishment Response" to the SMF.
For the UPF, this process involves the following steps:

1. Upon receiving the request from the SMF, the UPF allocates the
necessary resources for the session based on the provided QoS and
routing information. This ensures the availability of resources and
facilitates the establishment of tunnels for the session.

2. The UPF establishes the user plane path for the PDU session based on
the given information. This involves setting up a unidirectional tunnel
from the gNodeB to the UPF to facilitate the flow of user data.

3. The UPF updates the tunnel information to the SMF, including its IP
address and other relevant parameters. This information is crucial for the
SMF to identify and manage the session, as well as to facilitate further
control and communication with the UPF.

Figure 3.3: UPF Latency Contribution

3.2 Collection method


This segment provides background information on the data collection process.
It briefly describes the system setup environment and explains how different
load levels and data from different network functions were obtained and
processed.

3.2.1 System environment descriptions


Ericsson has a dedicated system for deploying and managing cloud services,
allowing all NFs to request resources and be deployed on virtual machines.
The entire cloud infrastructure is managed and orchestrated using Kubernetes.
The operating system used is Linux. The Traffic Generator is deployed on
physical servers, whose operating systems are also based on Linux.

3.2.2 System working mechanisms


The entire testing system consists of NFs and Traffic Generators; under
their coordination, the testing of the NFs is completed.

3.2.2.1 Traffic Generator


The primary goal of the Traffic Generator is to simulate the behavior of real
UEs and RAN signals in a 5G network. It can simulate millions of users for
5GC nodes (AMF, SMF, UDM, and UPF) and generate signaling (control
plane) and payload traffic (user plane) on functional interfaces. The Traffic
Generator supports multiple protocols and simulates the correct behavior of
interfaces towards real nodes. In simple terms, it can be seen as a collection
of multiple real users.
When simulating a UE, the Traffic Generator assigns an IMSI to identify the
UE. Users are differentiated only as 4G or 5G users, since they may face
different testing scenarios. The Traffic Generator can also simulate the
RAN signaling, which enables the AMF to process the signaling and forward
it to other NFs such as the SMF.
The Traffic Generator can simulate up to one million users at a time, with
the ability to define a specific proportion of users as active. However, to
reflect real-world conditions, not all defined users will be active and
executing sessions; if 100,000 users are defined, for example, the number
of active users fluctuates and may fall below 100,000. One foreseeable
consequence of this is fluctuation in the load on the network functions.

3.2.2.2 NFs
Each network function has logs to record the UE activities generated by the
Traffic Generator, usually in XML format. When a portion of the users
becomes active on the Traffic Generator side, these users can be considered
as a nearly fixed load appearing in each network function. By changing the
number of active users, the load on each network function can be controlled.
However, it is not feasible to record logs for all active users in the background
as it would impose a heavy load on the server and generate a lot of unnecessary
data. Therefore, a specific UE can be registered in each network function,
which allows displaying only the information of that UE across the network
functions. Additionally, the actual number of background users is much larger
than that specific UE, and multiple sets of servers in a network function are
used to share the load of such a large number of users. Hence, the impact of
this specific UE on the overall CPU can be considered negligible.
Once this UE is registered on each network function, tests can be
performed on the Traffic Generator side, with the ability to change the number
of test iterations and the test content. Enabling the monitoring functionality
on each network function allows obtaining the corresponding records.

3.2.3 Collecting process


First, an IMSI was selected and registered for 5G testing.
Initially, 5% (50,000 users) was chosen as the testing interval in the
Traffic Generator. This choice is based on the fact that the load of each
network function may fluctuate, and an interval of 50,000 users covers the
majority of this fluctuation range. However, this is not absolute, as the
50,000-user interval can sometimes be insufficient when the load is low.
Therefore, it is advisable to add test points at smaller intervals, such as
20,000 or 30,000 users, to ensure coverage in low-load situations.
Next is the number of test iterations and the corresponding time. Each test
is expected to last approximately five seconds, which allows the test
content to complete and the resources to be released afterward. However,
failed test cases are expected during the testing process. Therefore, a
large test point is set at 1,000 iterations, which can be completed in
approximately one and a half hours for a large-scale test. In high-load
scenarios, however, it may take approximately 7-8 hours to complete 200-300
valid tests, so the testing time needs to be extended accordingly.
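As a quick arithmetic check of the figures above (a sketch; five seconds is the per-test duration quoted in the text):

```python
# 1,000 iterations at roughly five seconds each.
iterations = 1000
seconds_per_test = 5
hours = iterations * seconds_per_test / 3600
print(f"{hours:.1f} hours")  # roughly one and a half hours
```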

3.3 Measurement infrastructure


3.3.1 Example message
As shown in Figure 3.4, an example message from one network function is
listed here. This XML file contains the following information:

Figure 3.4: Sample XML log.



• Message ID

• Timestamp

• UE information (i.e., IMSI)

• Ports and message types traversed by the message

• Address and port of the source network function

• Address and port of the destination network function

• Message protocol

• Encrypted message content

Figure 3.5: Example of NF services.

As for message types, 3GPP has defined the contents as shown in Figure 3.5.
All message types for the AMF and SMF discussed in this document are
included in Tables 3.1 and 3.2.

3.3.2 Example CPU log

As shown in Figure 3.6, an example CPU log from one network function is
listed here. This JSON file includes some key information; the additional
key-value pairs under "metric" are used for overall management. This file
is just an instantaneous log, but formal logs can span a longer duration,
resulting in more key-value pairs in the "value" section for better
record-keeping.

Figure 3.6: Sample CPU log.

The useful information for us here includes:

• Container: Contains the network element’s name.

• Pod: The pods contribute to the overall CPU utilization of the SMF
network element.

• Value: Contains a pair of values: the test time and the utilization rate
on this server.
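Reading the fields above from one log entry might look like the following sketch. The key names ("metric", "value", "workload_name", "container", "pod") follow the description; the sample record itself is invented.

```python
import json

# Sketch of extracting the useful fields from one CPU log entry of the
# shape described above. The record below is a made-up example.

sample = json.loads("""
{"metric": {"container": "smf-app", "pod": "smf-app-0",
            "workload_name": "smf-app"},
 "value": [1684570000, "0.42"]}
""")

metric = sample["metric"]
if "workload_name" not in metric:
    raise SystemExit("workload_name not found")  # the script exits here

timestamp, cpu = sample["value"]   # [timestamp, utilization rate]
cpu = float(cpu)
print(metric["container"], metric["pod"], timestamp, cpu)
```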

3.3.3 Collect and Preprocess Message


This part consists of two parallel components. The records regarding
latency and CPU are stored on separate SFTP and SSH servers for each
network element, and the file formats also differ. The overall execution
flow is illustrated in Figure 3.7.

Figure 3.7: Collect and preprocess message.

3.3.3.1 Latency collecting and preprocessing

The interval between records for each log is set to 5 minutes per document,
which facilitates browsing and opening the records. The process is shown in
Figure 3.8.
The input consists of a series of pcap files. In addition to that, the NF
type and the IMSI to be traced are required as inputs because there may be
more than one IMSI to trace in the test environment. Then, the log_decoder
is called. For this part, since the encryption methods are different, there are
existing programs that can be invoked for decoding, and the decoding process
directly produces corresponding pcap files. The next step is to examine the
information in these Wireshark packets, such as the start and end times of
the packets, whether the tracked IMSI matches the number entered as input
to the program, and whether the network element type matches the expected
input to be processed. If all of these match, the next step is to merge and
process the files to create a consolidated file with a filename like PDU-
AMF_10_0520_262800000499999.pcap. However, if any of these conditions
do not match, the process returns to the first step to re-enter the relevant
information.

Figure 3.8: Latency’s collecting and preprocess.

Afterward, there are some editable filters kept in CSV files. The filters
are based on strings, message types, protocol types, key-value types, etc.,
drawn from the relevant information. Irrelevant or error-related packets
are filtered out. The useful information is stored in a file like PDU-
AMF_10_0520_262800000499999.csv, while the irrelevant or error-related
packets are stored in PDU-AMF_10_0520_262800000499999_Nonset.csv.
The stored information includes the following:

• Packet relative sequence

• Packet relative time

• Source address

• Destination address

• Protocol

• Information for the service
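The filtering and splitting just described can be sketched as follows. The filter terms and the sample rows are illustrative only, not the actual filter set kept in the thesis's CSV files.

```python
import csv
import io

# Sketch: rows whose protocol and info match an allow-list are kept as
# valid, everything else is diverted to the "_Nonset" output. The filter
# terms and packet rows below are invented for illustration.

FILTERS = {"protocols": {"HTTP2", "NGAP", "PFCP"},
           "info_keywords": ("PDUSession", "sm-contexts", "UL NAS", "DL NAS")}

def split_rows(rows):
    kept, nonset = [], []
    for row in rows:
        ok = (row["Protocol"] in FILTERS["protocols"]
              and any(k in row["Info"] for k in FILTERS["info_keywords"]))
        (kept if ok else nonset).append(row)
    return kept, nonset

raw = io.StringIO(
    "No.,Time,Source,Destination,Protocol,Info\n"
    "1,0.000,10.0.0.1,10.0.0.2,NGAP,UL NAS transport\n"
    "2,0.004,10.0.0.2,10.0.0.3,HTTP2,HEADERS: POST /nsmf-pdusession/v1/sm-contexts\n"
    "3,0.005,10.0.0.2,10.0.0.9,TCP,Retransmission\n")
kept, nonset = split_rows(csv.DictReader(raw))
print(len(kept), len(nonset))  # 2 kept, 1 diverted
```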

3.3.3.2 Recording and preprocessing the CPU utilization

The log of CPU utilization can be customized for a specific duration. The
overall collection process is illustrated in Figure 3.9.

Figure 3.9: CPU utilization’s collecting and Preprocess.

In this section, the focus is on processing the JSON files. First, the
corresponding file is read, similar to what was shown in Figure 3.6. The
"metric" key corresponds to a dictionary containing various metrics, and
the "value" key is read as a list containing a timestamp and a value. Next,
the dictionary is iterated through to find the key "workload_name". If the
corresponding element is found, the corresponding "value" is stored in a
new list; if it is not found, the program exits. After the iteration, there
are usually 8 groups of servers for the AMF and 5 groups for the SMF. They
have the same timestamps and slightly different CPU utilization values;
due to load-balancing strategies, the CPU utilization is generally similar
within each group. Therefore, the average CPU utilization is calculated,
and the timestamps are converted to relative time for ease of merging with
the latency file, so that the CPU utilization at each moment can be
computed. This list is then stored in a file with a name like AMF_5.log.
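The averaging and relative-time conversion can be sketched as follows, assuming each server group reports the same timestamps; the numbers are illustrative.

```python
# Sketch: several server groups report the same timestamps with slightly
# different CPU values; average per timestamp, then convert timestamps to
# relative time. The sample series are invented.

def aggregate(groups: list[list[tuple[int, float]]]) -> list[tuple[int, float]]:
    """groups: one (timestamp, cpu) series per server group."""
    out = []
    for samples in zip(*groups):          # same index -> same timestamp
        ts = samples[0][0]
        avg = sum(cpu for _, cpu in samples) / len(samples)
        out.append((ts, avg))
    t0 = out[0][0]                        # first timestamp becomes t = 0
    return [(ts - t0, round(avg, 4)) for ts, avg in out]

groups = [
    [(1684570000, 0.40), (1684570005, 0.44)],   # server group 1
    [(1684570000, 0.42), (1684570005, 0.46)],   # server group 2
]
print(aggregate(groups))  # [(0, 0.41), (5, 0.45)]
```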

3.4 Statistics and modeling

The preprocessing described in the previous section is followed by
calculating the latency of each component and introducing additional
metrics for analyzing the overall results. Furthermore, the rationale
behind the modeling and the selection of polynomial fitting is explained.
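Since the latency-versus-load relationship is later modeled with polynomial fitting, a minimal illustration may help. The sketch below fits a degree-2 polynomial by solving the normal equations with the standard library only (in practice a routine such as numpy.polyfit serves the same purpose); the (load, latency) samples are invented and lie exactly on y = 0.01x^2 + 2.

```python
# Minimal least-squares quadratic fit, y = a*x^2 + b*x + c, done by
# solving the 3x3 normal equations with Gaussian elimination.

def polyfit2(xs, ys):
    n = len(xs)
    s = lambda p: sum(x ** p for x in xs)                 # sum of x^p
    sy = lambda p: sum((x ** p) * y for x, y in zip(xs, ys))
    A = [[s(4), s(3), s(2), sy(2)],    # augmented normal-equation matrix
         [s(3), s(2), s(1), sy(1)],
         [s(2), s(1), n,    sy(0)]]
    for i in range(3):                 # forward elimination with pivoting
        pivot = max(range(i, 3), key=lambda r: abs(A[r][i]))
        A[i], A[pivot] = A[pivot], A[i]
        for r in range(i + 1, 3):
            f = A[r][i] / A[i][i]
            A[r] = [a - f * b for a, b in zip(A[r], A[i])]
    coeffs = [0.0, 0.0, 0.0]           # back substitution -> [a, b, c]
    for i in (2, 1, 0):
        coeffs[i] = (A[i][3] - sum(A[i][j] * coeffs[j]
                                   for j in range(i + 1, 3))) / A[i][i]
    return coeffs

# hypothetical (CPU load %, latency ms) samples on y = 0.01*x^2 + 2
xs = [10, 20, 30, 40, 50]
ys = [0.01 * x ** 2 + 2 for x in xs]
a, b, c = polyfit2(xs, ys)
print(round(a, 4), round(b, 4), round(c, 4))
```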

3.4.1 Statistical Analysis

In this step, further processing of the files is carried out, classifying
them into valid and invalid samples and computing statistics for each.
Subsequently, the CPU log processed in the previous step is appended to the
valid samples, facilitating further analysis and plotting. The overall
process is shown in Figure 3.10.

Figure 3.10: Statistic process

3.4.1.1 Latency’s statistic process

For the latency analysis, we obtained CSV files containing valid samples
and invalid samples in the previous step. However, the valid samples are
only filtered by the appropriate service content, protocol, and specific
message IDs; there may still be messages missing in this CSV. Therefore,
the purpose of this step is to filter out the samples with missing messages
and create a separate statistics file for them. The process is shown in
Figure 3.11.

Figure 3.11: Latency’s statistic process
The entire file is read and stored as a dictionary. First, filtering is performed
based on message type, message header number, and protocol type.

Figure 3.12: Latency’s statistic process example.

For example, as shown in Figure 3.12, the message "HEADERS[6383]: POST
/nsmf-pdusession/v1/sm-contexts" belongs to the AMF transmitting an
SM-Contexts message to the SMF. If this is a normal message, there must be
a corresponding response message with the same header number of 6383, and
both are of HTTP2 protocol type. Indeed, there is a "201 Created" message as
a response from the SMF to the AMF. Similarly, there are some individual
uplink NAS messages and downlink NAS messages, which can be listed
separately. However, most message services have requests and responses, and
they can be identified based on this.
Once these are confirmed as potentially valid samples, the relative order of
the messages can be determined, since the message flow of a successful sample
from a network element is fixed, as shown in Figures 3.1, 3.2, and 3.3. For
example, in the case mentioned earlier, the UL NAS transport message is
assigned sequence number 1, the POST message sent from the AMF is assigned
sequence number 2, its response is assigned sequence number 3, and so on. In
this way, the messages are marked.
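This marking step can be sketched as follows; the message names and the dictionary layout are illustrative assumptions, not the actual trace format:

```python
# Hypothetical sketch: tag captured messages with their position in the
# expected PDU-session message flow. Names below are illustrative only.
EXPECTED_FLOW = [
    "UL NAS transport",                      # seq 1: RAN -> AMF
    "POST /nsmf-pdusession/v1/sm-contexts",  # seq 2: AMF -> SMF
    "201 Created",                           # seq 3: SMF -> AMF
    # ... remaining messages of the flow, up to seq 10
]

def assign_sequence(messages):
    """Attach a 'seq' field to each message dict (None if not in the flow)."""
    order = {name: i + 1 for i, name in enumerate(EXPECTED_FLOW)}
    for msg in messages:
        msg["seq"] = order.get(msg["name"])
    return messages

msgs = assign_sequence([{"name": "UL NAS transport"},
                        {"name": "201 Created"}])
```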
The next step is to remove missing or redundant messages. First, starting
from the beginning, if the sequence does not start with the UL NAS transport
message sent from the RAN to the AMF, these messages are added directly
to the "missing" dictionary until a UL NAS transport message is encountered.
Next, all messages with sequence number 1 are identified, and the stream is
sliced at these points and converted into lists. If the length of a list is greater
than 10, there must be redundant messages; in this case, the duplicate messages
at the end are excluded and the next check is performed. If the length of the
list is less than 10, it indicates missing messages. If the length is exactly 10,
the process checks that the sequence numbers increase monotonically and that
their relative times are smaller than that of the next sequence-number-1
message. The longer lists mentioned earlier go through the same check after
their duplicate segments are removed. In addition, the count of missing
messages is recorded for later statistical analysis.
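A minimal sketch of this pruning logic, assuming a complete flow of 10 sequence numbers and omitting the relative-time check for brevity:

```python
# Minimal sketch of the pruning step: slice the message stream at each
# sequence-number-1 message and keep only complete groups of 10 strictly
# increasing sequence numbers; the relative-time check is omitted here.
FLOW_LEN = 10  # a complete PDU-session sample spans 10 messages (assumed)

def split_samples(seqs):
    """Slice a flat list of sequence numbers at every occurrence of 1."""
    starts = [i for i, s in enumerate(seqs) if s == 1]
    return [seqs[a:b] for a, b in zip(starts, starts[1:] + [len(seqs)])]

def prune(seqs):
    valid, missing = [], 0
    for group in split_samples(seqs):
        g = group[:FLOW_LEN]                     # drop trailing duplicates
        if g == list(range(1, FLOW_LEN + 1)):
            valid.append(g)
        else:
            missing += 1                         # incomplete: count as missing
    return valid, missing

valid, missing = prune([1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
                        1, 2, 3,                        # a broken sample
                        1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10])
```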
After these removal steps, valid samples are obtained. At this point, the list
is appended to a new dictionary, forming the first page of the results, "Result
samples." The time intervals between messages, such as between order 2 and
order 1 for the AMF, and between order 7 and order 4 for the second interval,
are recorded as separate columns. The total latency is calculated as well.
Additionally, basic statistics of the samples are computed, including the
number of valid samples and the maximum, minimum, and average values.
Especially for the SMF, where TCP delayed acknowledgment exists,
calculating percentile values of the data can also reflect its characteristics.
These statistical values are included in the second page of the results,
"Statistic result."
Similarly, at the beginning, another dictionary is used to read the invalid
samples. The basic failure type in the invalid samples is the HTTP fault
header. Therefore, a list of HTTP fault headers is prepared and iterated over.
If an HTTP fault header is found, it is counted and stored in a separate
dictionary; if not, the process exits. In the end, the error samples are included
in the "Fault Sample" page and the missing samples in the "Missing Sample"
page, both of which are part of the "Miss and Fault statistic" page.
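The fault-header tally over invalid samples can be sketched as follows; the header list and the plain-string sample format are assumptions:

```python
# Sketch: count HTTP fault headers across invalid samples.
from collections import Counter

FAULT_HEADERS = ["404 Not Found", "500 Internal Server Error",
                 "503 Service Unavailable", "504 Gateway Time-out"]

def count_faults(invalid_samples):
    counts = Counter()
    for sample in invalid_samples:
        for header in FAULT_HEADERS:
            if header in sample:
                counts[header] += 1
    return counts

faults = count_faults(["HTTP2 404 Not Found",
                       "HTTP2 404 Not Found",
                       "HTTP2 503 Service Unavailable"])
```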

3.4.1.2 CPU utilization’s statistic process

Figure 3.13: CPU utilization’s statistic process

After obtaining the results, it is only necessary to combine the corresponding
CPU log with the obtained latency results. The CPU log is refreshed
approximately every half minute, and the values remain the same within that
half-minute interval. Therefore, by using the precalculated relative times from
both sources, the CPU utilization value of the corresponding interval can be
added to each valid latency result and displayed in a separate column.
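A sketch of this time-based join, assuming the CPU log is a sorted list of (relative time, utilization) pairs and illustrative field names:

```python
# Sketch: attach the ~30 s CPU log value to each latency row by relative time.
# Field names ('rel_time', 'cpu') are illustrative.
import bisect

def attach_cpu(samples, cpu_log):
    """cpu_log: sorted list of (relative_time_s, cpu_percent) pairs."""
    times = [t for t, _ in cpu_log]
    for s in samples:
        # index of the last log entry at or before this sample's timestamp
        i = bisect.bisect_right(times, s["rel_time"]) - 1
        s["cpu"] = cpu_log[max(i, 0)][1]
    return samples

log = [(0, 35), (30, 42), (60, 57)]           # one value per half minute
rows = attach_cpu([{"rel_time": 12.4}, {"rel_time": 61.0}], log)
```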

3.4.2 Model construction


When considering the process of establishing a model, we need to fully
consider the actual situation of the problem. The network elements discussed
in this article contain a relatively small number of variables and have a certain
degree of certainty. In this case, we usually do not need overly complex
models. The polynomial model is simple in form, easy to understand and
operate, and can fit various types of data. Therefore, it is reasonable to choose
the polynomial model for fitting.
According to the network elements discussed, they all ultimately face
high-load situations. For the AMF, there is a slow-start phenomenon under
low load, and for the SMF, TCP Delayed Acknowledgment is always present,
a phenomenon that is more pronounced under low load. These situations
may cause the data to change in ways that do not follow a simple linear
relationship. A characteristic of the polynomial model is that it can fit
non-linear data relationships; therefore, in this case, polynomial fitting can
better describe and predict the complex relationship between these variables.
In this case, overly complex models may introduce unnecessary errors and
cause overfitting. The polynomial model provides a good balance between
complexity and accuracy, so this article chose polynomial fitting to describe
these scenarios.
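On synthetic data, the kind of quadratic latency-load fit described here can be reproduced with NumPy's `polyfit`; the numbers below are invented for illustration, not measured values:

```python
# Sketch of the quadratic latency-vs-load fit; the data below is synthetic,
# generated from an invented U-shaped curve plus small noise.
import numpy as np

cpu = np.array([10, 20, 30, 40, 50, 60, 70, 80], dtype=float)
lat = 1e-4 * cpu**2 - 7e-3 * cpu + 2.8        # invented underlying curve
lat = lat + np.array([0.01, -0.02, 0.015, 0.0, -0.01, 0.02, -0.005, 0.01])

coeffs = np.polyfit(cpu, lat, deg=2)          # [a, b, c] of a*x^2 + b*x + c
fitted = np.polyval(coeffs, cpu)
max_residual = float(np.abs(fitted - lat).max())
```

A positive leading coefficient recovers the U shape discussed in the results chapter.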
Chapter 4

Results and analysis

In this chapter, we first present overall results such as the latency versus
CPU occupancy relationship, followed by detailed analyses: segmented
analysis of the fitting results, throughput analysis, and packet loss and error
analysis.
However, due to certain confidentiality reasons, we will refrain from
disclosing specific result data and instead describe them using more abstract
terms, such as moderate load, small latency, and so on. We apologize for any
inconvenience caused by this limitation in reading.

4.1 Results in AMF


Due to limitations in the test environment, the number of IMSIs can only
increase in increments of 10,000 as the minimum unit. Additionally, the test
environment considers real-life scenarios and attempts to evenly distribute the
load across pods in the network function, as depicted in Figure 4.1. Here, all
pods of the AMF are shown, and similarly, the SMF also has several pods to
manage these users as evenly as possible.


Figure 4.1: Pods in AMF.

The theoretical maximum capacity for the test is 1,000,000 IMSIs. The
minimum test unit is 10,000, but due to limitations such as test duration and
server performance, incrementing the test by the minimum unit would not
only greatly extend the test time but also cause serious overlap of test results.
This is because, in order to simulate a real-world environment as closely as
possible, the number of users specified is merely an approximation of the
proportion of active users. For instance, we specified a certain number of
users to perform PDU session establishments; the actual number of
participating IMSIs may fluctuate, and this fluctuation is random and may
change within the testing time. Consequently, conducting tests in increments
of 5%, or 50,000 units, provides the most efficient coverage of each integer
CPU load and results in a smoother average latency curve. However, for
situations with lower loads, a margin of 50,000 cannot cover all CPU loads,
so some extra data points were added to the experiment to fill the gaps.

4.1.1 General Results


In this section, we provide an overview of the key findings without presenting
specific numerical data. Due to confidentiality requirements, we are unable to
disclose the detailed information contained in the tables. Still, there are some
features to summarize:

• As the number of IMSIs increases, there is a general trend of higher
CPU load ranges, indicating increased system usage.

• The average response times exhibit slight fluctuations among different
data points, showing a subtle trend from small to larger latencies. The
highest average response time is observed at 650,000 IMSIs.

• The maximum response times vary significantly among different data
points, demonstrating pronounced fluctuations. The highest maximum
response time is observed at 500,000 IMSIs.

Overall, the data suggests that as the number of IMSIs increases, there is
a gradual increase in CPU load and response times, particularly in the case of
maximum response times. The variation in average and percentile response
times is relatively minimal.

4.1.1.1 Fitting Latency-Corresponding CPU Load Result

Figure 4.2: Fitting latency and corresponding CPU load for AMF.

Figure 4.2 illustrates the latency-load curve of the AMF, which exhibits a
U-shaped pattern with the left side displaying lower latency compared to
the right side. This pattern can be attributed to two main factors. Initially,
the gradually decreasing part with lower latency is due to the slow-start
mechanism implemented at the beginning of network communication. As the
sender gradually increases the data transmission rate, the network load remains
relatively low, leading to lower latency. However, as the load increases, the
latency rises due to the increased demand for network resources. Therefore,
polynomial fitting, up to quadratic order, is well suited to analyzing the data.

We can simplify the subsequent analysis into four parts: the slow start
phase, the peak efficiency phase, the load escalation phase, and the overload
phase.

4.1.1.2 Fitting Latency-Corresponding CPU Load Distributions


4.1.1.2.1 The P50 Latency-CPU load distributions analysis
The quadratic fit equation for the 50.0th percentile is:

y = 9.58 × 10−5 · x2 − 6.81 × 10−3 · x + 2.75 (4.1)

It can be observed from Figure 4.3 that the curve for the P50 percentile is not
particularly distinct due to the limited amount of retained data. Although
the overall trend can still be seen, it is not prominently manifested. In this
fit, the maximum value is a small latency at the highest CPU load, while the
minimum value, only slightly smaller than that maximum, occurs at a medium
CPU load.
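Since Equation (4.1) is a quadratic with a positive leading coefficient, its minimum sits at x = −b/(2a); a quick check locates the CPU load with the lowest fitted P50 latency:

```python
# Locate the minimum of the fitted P50 quadratic, Equation (4.1):
# y = a*x^2 + b*x + c has its vertex at x = -b / (2a).
a, b, c = 9.58e-5, -6.81e-3, 2.75
x_min = -b / (2 * a)                  # CPU load (%) with the lowest fit value
y_min = a * x_min**2 + b * x_min + c  # fitted latency at that load
```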

Figure 4.3: Scatter and Fitting in P50 for AMF.

4.1.1.2.2 The P75 Latency-CPU load distributions analysis


The quadratic fit equation for the 75.0th percentile is:

y = 3.03 × 10−4 · x2 − 2.13 × 10−2 · x + 3.21 (4.2)



The P75 results in Figure 4.4 already clearly reflect the overall trend. In this
fit, the maximum value nearly reaches a medium latency at the highest CPU
load, while the minimum value is a small latency at a slightly lower CPU load
than for P50.

Figure 4.4: Scatter and Fitting in P75 for AMF.

4.1.1.2.3 The P90 Latency-CPU load distributions analysis


The quadratic fit equation for the 90.0th percentile is the same as for the
75.0th percentile, which means they share the same maximum and minimum
values. The scatter plots, however, look slightly different, as shown in
Figure 4.5:

Figure 4.5: Scatter and Fitting in P90 for AMF.

4.1.1.2.4 The P95 Latency-CPU load distributions analysis


The quadratic fit equation for the 95.0th percentile is:

y = 3.89 × 10−4 · x2 − 2.50 × 10−2 · x + 3.26 (4.3)

The graph for the P95 percentile is the most representative in Figure 4.6,
as it reflects the significant impact of slow start and high load on the latency of
the AMF, resulting in increased latency values. In this fit, the maximum value
is a medium latency at the highest CPU load, while the minimum value is in a
low range of latency at a moderate CPU load.

Figure 4.6: Scatter and Fitting in P95 for AMF.

In summary, the maximum latency values for the AMF always occur at the
maximum CPU load, while the minimum values occur around a moderate or
medium load level.

4.1.2 Latency Analysis


4.1.2.1 Slow Start Phase
The activation of slow start in the AMF is aimed at ensuring smooth data
transmission and avoiding excessive load pressure on the core network at the
beginning. As observed, while the CPU load gradually increases at low levels,
the average latency decreases. This can be regarded as the success of the
slow-start strategy: the reduced latency signifies faster delivery of data
packets, enabling efficient processing within the core network.

Figure 4.7: Slow Start Phase.

4.1.2.2 Peak Efficiency Phase


This phase represents the most efficient moment for the AMF, where the
latency remains unaffected by a slow start and varying load conditions. In this
phase, the average latency exhibited an increase in the small range. It indicates
that the AMF has reached a stable state, efficiently handling data transmission
and processing requests to ensure reliable and timely communication between
network functions.

Figure 4.8: Peak Efficiency Phase.

4.1.2.3 Load Escalation Phase


As the load gradually increases, the latency of the AMF also increases. In
this stage, the observed minimum average latency is in the small range, and
the maximum average latency is in the medium range. During this phase, the
AMF may encounter congestion, resource contention, or processing delays.
However, initially, it can still maintain relatively low latency until the load
further increases towards the overload level.

Figure 4.9: Load Escalation Phase.

4.1.2.4 Overload Phase


When overload protection is triggered, the AMF often discards packets, as
evidenced by the deregistration attempts from user devices in the captured
packets in Figure 4.10, which then leads to retransmission. Additionally,
during this period certain packets may be retransmitted, as indicated by
completely duplicated requests in the captured data, and some requests are
rejected due to safety mode. The average latency also increases significantly
during this period, rising from a medium-range average to a peak average in
the large range.

Figure 4.10: Deregistration from UE in Overload Phase.



Figure 4.11: Overload Phase.

4.1.3 Overload Mode in AMF


In this mode, there are several different types of messages observed.
• Deregistration request (UE originating): In Figure 4.10, it is likely that
the AMF is unable to fulfill the request sent by the UE, indicating that
it has entered overload protection mode. As a result, the UE initiates a
deregistration request to avoid unnecessary waiting or error handling.
• Paging: The paging procedure is initiated by the AMF to page a UE in
IDLE state over 3GPP access. Under high load conditions, the AMF
faces a larger volume of operations to process, or simply needs to
transmit more data, which leads to packet delays or losses and results in
a longer paging cycle.

• Message retransmission: As shown in Figure 4.12, two identical
messages are sent from the AMF to the SMF, the second one second
after the first, yet only the second receives a reply. This can also be
attributed to packet loss or delay.

Figure 4.12: Message retransmission.

• DL NAS transport (Payload was not forwarded): This indicates that the
AMF is experiencing congestion and high load, resulting in the payload
not being correctly processed and forwarded.

4.1.4 Fault and Loss Analysis


In the fault and loss analysis, the actual number of successful packets is
greater than the number shown in the statistics. This can be explained by the
following example: if we want to collect 800 successful samples, the actual
number of experiments conducted will be much greater than 800. Therefore,
in this analysis we consider the actual samples obtained. The earlier data
represents a random sampling of the desired number of samples whenever the
quantity exceeds the desired sampling amount, and the analysis is conducted
on the selected samples.
For the analysis of packet loss rate, the data presented in Table 4.1 indicates
that the AMF begins to encounter packet loss when the number of users
exceeds approximately 350,000 and the CPU load approaches a medium-to-high
level. As the load increases, the loss rate gradually rises. Initially, the loss
rate is minimal, with only a few packets being lost. However, as the AMF
approaches its overload threshold, the loss rate escalates significantly; under
severe overload conditions it can reach a maximum of 88.81%. This result
highlights the significant impact of load increase on the packet loss rate, so
load management strategies and capacity planning are crucial for the AMF.
For the fault analysis, as shown in Table 4.2, the AMF encounters only
one type of HTTP fault header: 404 Not Found. These errors primarily occur
due to increased load. In the initial stage, the occurrence of HTTP fault
headers is minimal; under significantly overloaded conditions, the maximum
occurrence can reach 17.16%. This indicates that as the load increases, the
error rate of the AMF also increases. Understanding these error

IMSIs num. CPU load level Actual Sessions Loss sessions Loss rate(%)
0–3 × 105 Low/Moderate / / /
3.5 × 105 Medium 958 7 0.73
3.7 × 105 Medium 510 9 1.76
4 × 105 Medium to High 1161 104 8.96
4.5 × 105 Medium to High 1404 343 24.43
5 × 105 High 418 154 36.84
5.5 × 105 High 638 224 35.11
6 × 105 High 342 140 40.94
6.5 × 105 High 134 119 88.81

Table 4.1: Loss rate in AMF.
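As a sanity check, the loss-rate column of Table 4.1 can be reproduced directly from the session counts:

```python
# Reproduce the loss-rate column of Table 4.1 from (actual, lost) session counts.
rows = [(958, 7), (510, 9), (1161, 104), (1404, 343),
        (418, 154), (638, 224), (342, 140), (134, 119)]
loss_rates = [round(100 * lost / total, 2) for total, lost in rows]
```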

occurrences is crucial for evaluating and improving the performance of the
AMF. Through optimization and load management strategies, it is possible to
reduce the error rate and enhance the reliability and stability of the AMF.
Regarding "404 Not Found," it is possible that the AMF is experiencing
high load, leading to some resources not being found as requests are processed
toward the SMF, as shown in Figure 4.13.

Figure 4.13: 404 not found example in AMF.

IMSIs num. Actual Sessions Fault(404) sessions Fault(404) rate(%)


0–3 × 105 / / /
3.5 × 105 958 0 0
3.7 × 105 510 17 3.33
4 × 105 1161 85 7.32
4.5 × 105 1404 153 10.90
5 × 105 418 50 11.96
5.5 × 105 638 75 11.76
6 × 105 342 44 12.87
6.5 × 105 134 23 17.16

Table 4.2: Fault rate in AMF.



4.2 Results in SMF


Similar to the AMF case, the test results are typically incremented in units
of 50,000. However, some additional data points were included below this
interval: particularly when the overall CPU load is not very high, an interval
of 50,000 cannot cover all CPU loads, so it is necessary to add some points at
intervals of 20,000 or 30,000. Similarly, the sample count presented here is
obtained through random sampling, ensuring an equal representation across
the overall results.

4.2.1 General Results


The confidentiality requirements are the same here, but still there are some
features to summarize:

• The response time of the SMF does not show a significant fluctuation
trend in most cases, generally falling within the range of medium-range
latency.

• When the number of IMSIs reaches 450, 000 and above, the response
times begin to rise significantly, indicating that the load pressure on the
relevant pods of the SMF also increases significantly.

• The maximum delay of the SMF often differs significantly from the
average and the minimum, which is caused by the TCP Delayed
Acknowledgment strategy. The elevated average response time under
low load is also due to this strategy.

• Although the average latency decreases to the medium level at around
600,000 and 650,000 IMSIs, it is important to note that most of the
corresponding CPU loads are at or below the high load level. This
means that the AMF and SMF are only beginning to be stressed, and
the CPU has not reached a truly high load. Figure 4.14 shows the CPU
load curve for 650,000 IMSIs; the CPU load under this number of
IMSIs is only at a medium-to-high level, as shown in Figure 4.15, which
cannot be considered a high load.

Figure 4.14: CPU load around 650,000 IMSIs.

Figure 4.15: An example PDU session establishment test in SMF around


650,000 IMSIs.

Additionally, when measuring under true high-load conditions, many of
the results are missing or incorrect. "Missing" means that in a test case
there are many messages sent without a reply; other results carry typical
HTTP fault headers, such as 503 Service Unavailable or 404 Not Found,
which are not the responses we want. For 600,000 IMSIs, there are only
7 valid samples above very high load, with an average in the large-latency
range, an even larger maximum value, and a small minimum value. As for
650,000 IMSIs, all samples have CPU loads below a certain high level.

4.2.1.1 Fitting Latency-Corresponding CPU Load Result

Figure 4.16: Fitting latency and corresponding CPU load for SMF.

As shown in Figure 4.16, there is a significant difference in the fitting results
at different percentiles for the SMF.
Firstly, the CPU load only goes up to a certain high level: although the
SMF’s server limit of 99.4% is reached in the test, the subsequent scatter plot
in Figure 4.20 shows that the effective samples of the SMF all lie at or below
that level, with only 7 cases reaching higher. So, unlike the AMF, there are
not many samples above very high load for fitting. Therefore, the fitting
process only considered the CPU range at and below a certain load level,
where most of the samples are located.
In the overall trend, it can also be seen that the fitting curve for SMF is very
smooth in the data at or below P75, and there is basically only an increasing
trend when it comes to high load. The curves for P90 and P95, on the other
hand, show more noticeable changes, especially in the case of high load, where
they increase significantly.

4.2.1.2 Fitting Latency-Corresponding CPU Load Distributions


4.2.1.2.1 The P50 Latency-CPU load distributions analysis
The polynomial fit equation for the 50.0th percentile is:

y = 6.13 × 10−8 · x3 − 3.25 × 10−6 · x2 + 2.92 × 10−3 · x + 1.56 (4.4)

As shown in Figure 4.17, the scatter plot results for P50 are relatively
balanced. While there may be minor variations in the fitted curve, the overall
latency difference remains within a small range compared to the average
latency. In this fit, the maximum value is in a small range at the highest CPU
load, while the minimum value, slightly smaller than that maximum, occurs
at a very low CPU load.

Figure 4.17: Scatter and Fitting in P50 for SMF.

4.2.1.2.2 The P75 Latency-CPU load distributions analysis


The polynomial fit equation for the 75.0th percentile is:

y = 2.34 × 10−7 · x3 + 2.45 × 10−4 · x2 − 6.81 × 10−4 · x + 2.07 (4.5)

Similarly, compared to the overall average latency, this variation is not
particularly significant in Figure 4.18; the overall latency rises as the load
rises. This also indicates that the TCP delayed acknowledgment strategy is
effective for the majority of the data. In this fit, the maximum value is in
a small range at the highest CPU load, while the minimum value is smaller
than that maximum at a CPU load slightly higher than for P50 but still low.

Figure 4.18: Scatter and Fitting in P75 for SMF.

4.2.1.2.3 The P90 Latency-CPU load distributions analysis


The polynomial fit equation for the 90.0th percentile is:

y = 2.13 × 10−5 · x3 − 6.00 × 10−4 · x2 + 8.54 × 10−3 · x + 2.12 (4.6)

The data for P90 reflects the relationship between latency and load, as shown
in Figure 4.19. However, the scatter plot does not include packets that are
significantly affected by TCP delayed acknowledgments. In this fit, the
maximum value is in a large range at the highest CPU load, while the
minimum value is a small latency at a CPU load much higher than for P75
but still low.

Figure 4.19: Scatter and Fitting in P90 for SMF.

4.2.1.2.4 The P95 Latency-CPU load distributions analysis


The polynomial fit equation for the 95.0th percentile is:

y = 5.34 × 10−5 · x3 − 1.90 × 10−3 · x2 − 2.10 × 10−2 · x + 4.54 (4.7)

The scatter plot for P95 in Figure 4.20 includes many samples that are
affected by TCP delayed acknowledgments. We can observe that at low load,
although the P50 and P75 curves are not much affected, the P95 curve is
influenced by TCP delays until roughly the moderate load level, beyond
which the effect fades. These samples are present in a certain proportion and
are widely distributed across the various CPU loads. As a result, the fitting
curve for P95 is noticeably higher than the other fitting curves. In this fit,
the maximum value is a large-range latency at the highest CPU load, while
the minimum value is a small-range latency at a CPU load slightly higher
than for P90 but still low.

Figure 4.20: Scatter and Fitting in P95 for SMF.

In summary, the maximum latency values for the SMF always occur around
the highest CPU load we set, while the minimum values vary across low to
moderate load levels.
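For the cubic fits, the load that minimizes the fitted latency follows from the roots of the derivative; applied to Equation (4.7) for P95:

```python
# Stationary points of the fitted P95 cubic, Equation (4.7):
# y' = 3*a3*x^2 + 2*a2*x + a1 = 0; the root with positive curvature
# is the load minimizing the fitted latency.
import math

a3, a2, a1, a0 = 5.34e-5, -1.90e-3, -2.10e-2, 4.54
A, B, C = 3 * a3, 2 * a2, a1
disc = B * B - 4 * A * C
x_min = (-B + math.sqrt(disc)) / (2 * A)   # larger root: y'' > 0 here
curvature = 6 * a3 * x_min + 2 * a2        # second derivative at x_min
```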

4.2.2 Latency Analysis


For SMF, the distribution of latency does not exhibit the same pronounced
influencing factors as in AMF. The variation in latency is relatively smooth
below a certain high CPU load, while beyond that load level, it enters a high-
load state. However, there is also a noticeable tail in the latency distribution
for SMF that warrants further analysis.

4.2.2.1 Beginning Phase


In this part, although the latency remains relatively constant for the P50,
P75, and P90 curves, the latency for P95 depends on the proportion of the
"little tail," which is influenced by the TCP Delayed Acknowledgment strategy.
When the background load is not very high, the overall data processing load
on the SMF may not be substantial. Upon the arrival of a data packet, the
SMF does not immediately send an acknowledgment but waits for a certain
period, which typically adds a considerable amount of latency. As shown in Figure 4.21
and the subsequent Figure 4.22, the main difference lies in the number of TCP
delayed acknowledgments. The most notable change is that the dashed line
representing P90 gradually shifts from the back to the front.

Figure 4.21: Normal Phase.

During this period, most of the latency distribution is close to a normal
distribution within the small-to-medium latency range.

4.2.2.2 Peak Efficiency Phase


In this section, two observations can be made. First, as mentioned earlier,
the TCP delayed acknowledgments decrease significantly due to the increase
in background load. Second, the majority of latencies still follow a normal
distribution within the range from small to medium amount of latency, with
the most common latency being in the small range. This segment represents
the highest average efficiency for SMF.

Figure 4.22: Peak Efficiency Phase.

4.2.2.3 Load Escalation Phase


In this phase, the first half of the SMF latency distribution no longer exhibits
the sharp, normal-distribution peak seen before; instead, more of the latency
mass sits in the right half. The proportion of the "tail" part is also not as
prominent as before, ranging only between 1–5%, due to the significant
increase in load. Under high load conditions, the system’s resources become
more constrained, leading to the suppression or deferral of the delayed
acknowledgment mechanism, as the system needs to process and forward
packets faster to cope with the increased load.

Figure 4.23: High Load Phase.

4.2.3 Fault and Loss Analysis


For the packet loss rate, the SMF starts experiencing packet loss at around
370,000 users, corresponding to approximately a medium CPU load. The
packet loss rate gradually increases with the load, from an initially low level
to a later rate of nearly high level. This indicates that as the workload on the
SMF intensifies, the system resources become more strained, leading to a
higher likelihood of packet loss.
The results of the error analysis are given in Table 4.4. Overall, as the
number of users and the load increase, the error rate of SMF also shows an
upward trend. The specific errors observed are related to HTTP fault headers.
Regarding "404 Not Found," the number of error samples increased from low
to moderate load levels. This error primarily occurs when the SMF sends a
PDU Session Establishment Accept to the AMF, as shown in Figure 4.24. At
this point, the AMF is

IMSIs num. CPU load level Actual Sessions Loss sessions Loss rate(%)
0–3.5 × 105 Low to Medium / / /
3.7 × 105 Medium 510 30 5.88
4 × 105 Medium 1054 205 19.45
4.5 × 105 Medium to High 1395 526 37.71
5 × 105 Medium to High 338 132 39.05
5.5 × 105 Medium to High 552 215 38.95
6 × 105 Medium to High 270 122 45.19
6.5 × 105 Medium to High 45 26 57.78

Table 4.3: Loss rate in SMF.

IMSIs AS F404S F404R F503S F503R F504S F504R F500S F500R


0–3.5 / / / / / / / / /
3.7 510 17 3.33 0 0 0 0 0 0
4 1054 85 8.06 0 0 0 0 0 0
4.5 1395 225 16.12 1 0.07 1 0.07 0 0
5 338 64 18.93 11 3.25 0 0 0 0
5.5 552 107 19.38 7 1.27 0 0 0 0
6 270 53 19.63 9 3.33 4 1.48 1 0.37
6.5 45 12 26.67 0 0 4 8.89 0 0

Table 4.4: Combined Fault Rate Table. IMSIs are in units of 105 . AS: Actual
Sessions, F404S: Fault(404) sessions, F404R: Fault(404) rate(%), F503S:
Fault(503) sessions, F503R: Fault(503) rate(%), F504S: Fault(504) sessions,
F504R: Fault(504) rate(%), F500S: Fault(500) sessions, F500R: Fault(500)
rate(%).

likely to be under overload protection, resulting in the SMF being unable to
obtain the requested resources; as a result, the SMF responds with a
cancellation. Normally, the UDM would reply to the SMF or at least send an
empty response. However, in some cases the requested resource is not found,
which could be due to the SMF not handling the request correctly or
forwarding an invalid request from the AMF to the UDM.

Figure 4.24: 404 Example in SMF

"503 Service Unavailable" occurs at the beginning of registration requests,
when the SMF initiates a subscriber data management request to the UDM,
as Figure 4.25 shows. In this case, the SMF is under high load and cannot
promptly process the request or respond, so the UDM is unable to provide
the required subscriber data management service, resulting in a 503 Service
Unavailable error.

Figure 4.25: 503 and 504 Example in SMF.

"504 Gateway Time-out" indicates that the SMF times out while completing
a request, as shown in Figure 4.25. This often occurs when the AMF initiates
a request to the SMF at the beginning; the SMF, being under high load, fails
to respond in time, exceeding the AMF’s predefined waiting time and
resulting in this error.
"500 Internal Server Error" may indicate insufficient resources in the SMF
to fulfill the request, as shown in Figure 4.26. As this occurrence is rare and
happens when the SMF is under high load, it is unlikely to be a software or
code issue.

Figure 4.26: 500 Example in SMF.

4.3 Results in UPF


For the UPF, the data exhibits significant variability. It is important to note
that the latency mentioned here includes not only the PFCP Session
Establishment Request and Response within the UPF but also the time taken
for requests sent from the SMF and returned to the SMF. Due to some
malfunctions in the UPF trace log functionality, it was not possible to
conduct independent testing, so this test was approached from the perspective
of the SMF.

4.3.1 General Results


We can observe some fluctuations in the average latency of the UPF. The
average latency exhibits a notable range, with the lowest recorded value
differing significantly from the highest, and there is also a significant
difference between the maximum and minimum values of the data.

4.3.2 Latency Analysis


Generally speaking, the latency is spread widely along the axis. It is worth
noting that when the average latency is relatively high, the graph may exhibit
a bimodal shape, as in Figures 4.27 and 4.29, whereas when the latency is
relatively low, the graph may resemble a normal distribution, as in Figure 4.28.

Figure 4.27: Around 100,000 Users in UPF.



Figure 4.28: Around 200,000 Users in UPF.

Figure 4.29: Around 400,000 Users in UPF.


Chapter 5

Conclusion and Future Work

5.1 Conclusion
This thesis aims to measure and analyze the performance of several core
Network Functions in a practical 5GC scenario: PDU session establishment.
By understanding the actual performance of these NFs, particularly in terms
of latency and resource utilization, we can determine whether the deployed
NFs are functioning properly and assess their performance in real-world
applications. Importantly, by analyzing the latency contribution of these
network functions, we can gain insight into overall network operation and
adjust the relevant parameters in time to provide the best user experience.
To carry out the practical measurements, we gradually increase the number
of active users in the Traffic Generator while observing the performance
changes of the various network functions and recording the results. We then
organize the results, generate the latency and server-load performance curves
for the AMF and SMF, and analyze the variation in UPF latency under different
levels of user activity. This information is beneficial for practical testing
and application.
Therefore, in terms of overall outcomes, this thesis discusses how to
infer the latency contribution of each network function to the entire network,
achieved by running the same scenario tests under varying loads. If these
network devices can respond to and process user requests in a normal and
stable manner, we can deduce their dependence on network load, i.e., the
range of loads under different scenarios. Regarding the relationship between
E2E latency and network load, in general, as the load increases, the end-
to-end latency of network functions also increases. Notably, a significant
increase in latency becomes evident when the number of users approaches
approximately 400,000. This latency spike occurs for both the AMF and SMF,
signifying a heavy load level for both functions. After a significant increase in
load, the AMF activates overload protection, resulting in delayed or discarded
processing of some packets. As a result, the AMF experiences an increased
packet-loss rate, while the SMF exhibits a higher number of errors. For
the system as a whole, operation becomes quite challenging when the user
count reaches 600,000: access is typically limited once the overall load enters
overload mode, and the latency becomes unpredictable. Because of measures
taken to optimize transmission efficiency under low loads, the best overall
efficiency is usually achieved under moderate loads. For the AMF, this occurs
at a moderate load corresponding to a user count of 250,000 to 320,000.
Similarly, for the SMF, the optimal load ranges from light to moderate,
corresponding to around 250,000 to 350,000 users. Therefore, the
recommended operational load under normal circumstances is roughly between
250,000 and 320,000 users, and to achieve the best overall performance,
testing should be conducted with a load that stabilizes at the moderate level,
corresponding to 300,000 to 320,000 users.

5.2 Limitations
One issue is that the UPF trace log functionality encountered errors, making
it difficult to measure the load of the UPF. Additionally, the latency of the
UPF can only be observed through the SMF, so the measurements may include
intermediate transfer time. Nevertheless, it is evident that the UPF latency
becomes significantly long, posing challenges for the analysis in this aspect.

5.3 Future work


We can explore the following aspects to improve the evaluation of the 5GC
network:

1. Once the UPF functionality can respond to requests within a reasonable
timeframe, we can analyze the latency distribution of the UPF to
understand its contribution to overall latency.

2. By adding more test points, we can strive for a more evenly distributed
scatter plot, which would enhance the accuracy of the results. Increasing
the number of data points across the various load levels would contribute
to a more comprehensive analysis.

3. We should incorporate testing scenarios beyond PDU session
establishment. Several other scenarios require coordination among
multiple network elements; for instance, Network Triggered Service and
N2 Handover should also be analyzed to gain a comprehensive
understanding of their performance.
TRITA – EECS-EX-2023:788

www.kth.se
