Oracle® Linux 8 Monitoring and Tuning The System

Oracle® Linux 8

Monitoring and Tuning the System

F24025-03
February 2020
Oracle Legal Notices

Copyright © 2019,2020 Oracle and/or its affiliates. All rights reserved.

This software and related documentation are provided under a license agreement containing restrictions on use and
disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement
or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute,
exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or
decompilation of this software, unless required by law for interoperability, is prohibited.

The information contained herein is subject to change without notice and is not warranted to be error-free. If you find
any errors, please report them to us in writing.

If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of
the U.S. Government, then the following notice is applicable:

U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any
programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial
computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental
regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any
operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be
subject to license terms and license restrictions applicable to the programs. No other rights are granted to the U.S.
Government.

This software or hardware is developed for general use in a variety of information management applications. It is not
developed or intended for use in any inherently dangerous applications, including applications that may create a risk
of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to
take all appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation
and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous
applications.

Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their
respective owners.

Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used
under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD
logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a
registered trademark of The Open Group.

This software or hardware and documentation may provide access to or information about content, products, and
services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all
warranties of any kind with respect to third-party content, products, and services unless otherwise set forth in an
applicable agreement between you and Oracle. Oracle Corporation and its affiliates will not be responsible for any
loss, costs, or damages incurred due to your access to or use of third-party content, products, or services, except as
set forth in an applicable agreement between you and Oracle.

Abstract

Oracle® Linux 8: Monitoring and Tuning the System provides information about monitoring system performance
by using various utilities and also includes instructions on how to configure and use Tuned for improved system
performance.

Document generated on: 2020-02-12 (revision: 9257)


Table of Contents
Preface
1 Working With Tuned
    1.1 About Tuned
        1.1.1 About Tuned Profiles
        1.1.2 About the Default Tuned Profiles
    1.2 About Static and Dynamic Tuning in Tuned
    1.3 Installing and Enabling Tuned by Using the Command Line
    1.4 Running Tuned in no-daemon Mode
    1.5 Administering Tuned
        1.5.1 Listing Tuned Profiles
        1.5.2 Activating a Tuned Profile
        1.5.3 Disabling Tuned
2 Monitoring the System
    2.1 Working With the sosreport Utility
    2.2 Working With System Performance Diagnostics Utilities
        2.2.1 About System Performance Utilities
        2.2.2 Monitoring System Resource Usage
        2.2.3 Monitoring CPU Usage
        2.2.4 Monitoring Memory Usage
        2.2.5 Monitoring Block I/O Usage
        2.2.6 Monitoring File System Usage
        2.2.7 Monitoring Network Usage
        2.2.8 Working With the Graphical System Monitor
    2.3 Working With OSWatcher Black Box
        2.3.1 Installing OSWbb
        2.3.2 Running OSWbb
        2.3.3 Analysing OSWbb Archived Files
3 Automating System Tasks
    3.1 About Auditing the System
    3.2 Working With System Log files
        3.2.1 About Logging Configuration (/etc/rsyslog.conf)
        3.2.2 Configuring Logwatch
    3.3 Using Process Accounting
4 Working With Kernel Dumps
    4.1 About Kdump
    4.2 Kdump Installation and Configuration
        4.2.1 Files That Are Used by Kdump
        4.2.2 Installing and Configuring Kdump
        4.2.3 Configuring the Kdump Output Location
        4.2.4 Configuring the Default Kdump Failure State
    4.3 Analyzing Kdump Output
    4.4 Using Early Kdump

Preface
Oracle® Linux 8: Monitoring and Tuning the System describes the various utilities, features, and services
that you can use to monitor system performance, detect performance issues, and improve the performance
of various system components.

Audience
This document is intended for administrators who need to configure and administer Oracle Linux. It is
assumed that readers are familiar with web technologies and have a general understanding of using the
Linux operating system, including knowledge of how to use a text editor such as emacs or vim, essential
commands such as cd, chmod, chown, ls, mkdir, mv, ps, pwd, and rm, and using the man command to
view manual pages.

Document Organization
The document is organized into the following chapters:

• Chapter 1, Working With Tuned describes how to use tuned to administer the system.

• Chapter 2, Monitoring the System describes how to use sosreport and OSWatcher to monitor the system
and improve it with the provided tuning tools.

• Chapter 3, Automating System Tasks describes how to automate system auditing and logging.

• Chapter 4, Working With Kernel Dumps describes how to diagnose problems with kernel dumps.

Related Documents
The documentation for this product is available at:

Oracle® Linux 8 Documentation

Conventions
The following text conventions are used in this document:

Convention    Meaning
boldface      Boldface type indicates graphical user interface elements associated with an
              action, or terms defined in text or the glossary.
italic        Italic type indicates book titles, emphasis, or placeholder variables for which
              you supply particular values.
monospace     Monospace type indicates commands within a paragraph, URLs, code in
              examples, text that appears on the screen, or text that you enter.

Documentation Accessibility
For information about Oracle's commitment to accessibility, visit the Oracle Accessibility Program website
at https://www.oracle.com/corporate/accessibility/.

Access to Oracle Support


Oracle customers that have purchased support have access to electronic support through My Oracle
Support. For information, visit
https://www.oracle.com/corporate/accessibility/learning-support.html#support-tab.

Chapter 1 Working With Tuned


This chapter describes the Tuned feature and Tuned profiles and includes tasks for using Tuned to
optimize performance on your Oracle Linux systems.

1.1 About Tuned


The Tuned application monitors a system to optimize its performance under certain conditions. Tuned uses
several predefined profiles to tune your system. The profiles that are provided are designed for particular
use cases and fall into one of two categories: power-saving profiles and performance-boosting profiles.
Performance-boosting profiles address low latency, high throughput for storage and the network, and
virtualization host performance.

You can modify the rules that are defined for each profile, as well as customize how a specific device is
tuned by using a specific profile. In addition, you can configure Tuned so that any change in device usage
triggers an adjustment in the current settings so that the performance of active devices is improved and
power consumption for inactive devices is reduced.

1.1.1 About Tuned Profiles


The following Tuned profiles are installed with Oracle Linux 8:

• balanced (default profile): Is a power-saving profile. This profile provides a balance between
performance and power consumption. The profile uses auto-scaling and auto-tuning when possible. A
possible drawback is increased latency.

• powersave: Is a profile that provides maximum power savings. The profile can minimize
actual power consumption by throttling performance.

Note

In some instances, the balanced profile is a better choice than the powersave
profile, as it is more efficient.

• throughput-performance (default profile): Is a server profile that is optimized for high throughput.
The profile disables power-savings mechanisms and enables sysctl settings to improve the throughput
performance of disk and network I/O.

• latency-performance: Is a server profile that is optimized for low latency. The profile disables power-
savings mechanisms and enables sysctl settings to improve latency.

• network-latency: Is a profile that provides low latency network tuning and is based on the latency-
performance profile. In addition, this profile disables transparent huge pages and NUMA balancing and
tunes several network-related sysctl settings.

• network-throughput: Is a profile for throughput network tuning and is based on the
throughput-performance profile. In addition, this profile increases kernel network buffers.

• virtual-guest (default profile): Is a profile that is designed for virtual guests and is based on the
throughput-performance profile. This profile decreases virtual memory swappiness and increases
disk readahead values.

• virtual-host: Is a profile that is designed for virtual hosts and is based on the
throughput-performance profile. This profile decreases virtual memory swappiness, increases disk
readahead values, and enables a more aggressive value of dirty pages writeback.

• desktop: Is a profile that is optimized for desktop environments and is based on the balanced profile.
In addition, this profile enables scheduler autogroups for better response of interactive applications.

Note

You can install additional profiles to better match your system configuration and
intended use case. For example, if you are using a real-time kernel with Oracle
Linux, you can use a real-time profile. These optional packages can be installed
from the __addons channel.

Note that real-time profiles will have no effect on kernels that are not compiled with
real-time support enabled.

To list all of the profiles that are currently available on your system, use the dnf
command:
# dnf list tuned-profiles*

Tuned profiles that are installed on the system by default are stored in the /usr/lib/tuned and
/etc/tuned directories. Distribution-specific profiles are stored in the /usr/lib/tuned directory. Note that
each profile has its own directory. Each profile directory consists of a main configuration file, tuned.conf,
as well as other optional files.

If you want to use a custom profile, copy the profile directory to the /etc/tuned directory, which is where
custom profiles are stored. In the event there are two profiles with the same name, the custom profile that
is located in /etc/tuned/ is used.

The tuned.conf file can contain one [main] section and additional sections for configuring plug-in
instances. Note that these sections are optional. For more information about profile configuration, see the
tuned.conf(5) manual page.
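
As an illustration of this layout, the following sketch writes a minimal custom profile directory. The function name write_custom_profile, the profile name, and the vm.swappiness value are hypothetical and chosen only for the example. The [main] section with its include option and the [sysctl] plug-in section follow the format described in the tuned.conf(5) manual page.

```shell
# write_custom_profile BASE_DIR NAME
# Creates BASE_DIR/NAME/tuned.conf. On a real system BASE_DIR would be /etc/tuned.
# The summary text and the sysctl value below are illustrative assumptions.
write_custom_profile() {
    base_dir=$1
    name=$2
    mkdir -p "$base_dir/$name" || return 1
    cat > "$base_dir/$name/tuned.conf" <<'EOF'
[main]
summary=Hypothetical profile derived from throughput-performance
include=throughput-performance

[sysctl]
vm.swappiness=10
EOF
}
```

For example, running write_custom_profile /etc/tuned my-profile as root and then tuned-adm profile my-profile would be one way to try the profile. Because the profile includes throughput-performance, only the settings that differ need to be listed.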

1.1.2 About the Default Tuned Profiles


A default Tuned profile is automatically selected when you install Oracle Linux. The default profile that is
selected is based on the given environment and the performance goals to be achieved in that particular
use case. The following default profiles are provided:

• throughput-performance: Is a profile that is used in an environment where compute nodes are
running Oracle Linux. This profile achieves the best throughput performance.

• virtual-guest: Is a profile that is used in an environment where virtual machines are running Oracle
Linux. This profile achieves the best performance. If you are not interested in the best performance, you
can change the profile to either the balanced or powersave profile.

• balanced: Is a profile that is used for other use cases. This profile achieves balanced performance and
power consumption.

1.2 About Static and Dynamic Tuning in Tuned


Static tuning applies settings that you have defined in the configuration files for sysctl, sysfs, and other
system configuration tools throughout the operating system.

You can configure the tuned service to monitor the activity of system components and dynamically tune
system settings, based on information that the service collects about the system and its current running
state.

Dynamic tuning can be particularly useful in situations where you need devices such as the CPU,
hard drives, and network adapters to consume as little power as possible when idle, but to deliver high
throughput and low latency when under a high load.

To enable dynamic tuning, set the correct value in the /etc/tuned/tuned-main.conf settings file as
follows:

dynamic_tuning = 1

You must then set the time interval in seconds for tuned to analyze the current system state in the same
configuration file so that it can dynamically tune the system, based on the collected results, for example:

update_interval = 10
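
The two settings above can also be applied from a script. The following is a minimal sketch, assuming GNU sed and a helper named set_tuned_option (a hypothetical name, not part of Tuned) that rewrites an existing "key = value" line or appends one if the key is absent.

```shell
# set_tuned_option FILE KEY VALUE
# Replaces an uncommented "KEY = ..." line in FILE, or appends the setting
# if no such line exists. Requires GNU sed for the -i option.
set_tuned_option() {
    file=$1
    key=$2
    value=$3
    if grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
        sed -i -E "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}
```

On a real system, you might run set_tuned_option /etc/tuned/tuned-main.conf dynamic_tuning 1 followed by set_tuned_option /etc/tuned/tuned-main.conf update_interval 10 as root, and then restart the tuned service so that it reads the new values.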

1.3 Installing and Enabling Tuned by Using the Command Line


The following procedure describes how to install and enable Tuned, install Tuned profiles, and preset a
default Tuned profile for your Oracle Linux systems.

1. If the tuned package is not already installed, install it:

# dnf install tuned

2. Enable and start the tuned service:

# systemctl enable --now tuned

3. Check the active Tuned profile:

# tuned-adm active

Current active profile: balanced

4. Verify that the Tuned profile is applied to the system:

# tuned-adm verify

Verfication succeeded, current system settings match the preset profile.


See tuned log file ('/var/log/tuned/tuned.log') for details.

If a message indicating that the current system settings do not match the profile is displayed, try
restarting the tuned service:

# systemctl restart tuned

1.4 Running Tuned in no-daemon Mode


Running tuned in no-daemon mode does not require any resident memory. When you are running the
service in this mode, tuned applies the settings and then exits.

To run tuned in no-daemon mode, set the following value in the /etc/tuned/tuned-main.conf settings file:
daemon = 0

If you choose to run tuned in no-daemon mode, be aware that tuned no longer supports D-Bus services
or the Hot-plug kernel subsystem. Thus, tuned can no longer automatically roll back any settings files that
were changed.

1.5 Administering Tuned


You administer Tuned by using the tuned-adm command. The following tasks describe how to administer
Tuned profiles and the tuned service on your Oracle Linux systems.

For more information, see the tuned-adm(8) and tuned(8) manual pages.

1.5.1 Listing Tuned Profiles


To list all of the available Tuned profiles on a system:
# tuned-adm list
Available profiles:
- balanced - General non-specialized tuned profile
- desktop - Optimize for the desktop use-case
- latency-performance - Optimize for deterministic performance at the cost of increased power consumpt
- network-latency - Optimize for deterministic performance at the cost of increased power consumpt
- network-throughput - Optimize for streaming network throughput, generally only necessary on older C
- powersave - Optimize for low power consumption
- throughput-performance - Broadly applicable tuning that provides excellent performance across a variety
- virtual-guest - Optimize for running inside a virtual guest
- virtual-host - Optimize for running KVM guests
Current active profile: balanced

The current active profile is also displayed with this output.

To display just the currently active profile:


# tuned-adm active
Current active profile: balanced
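
When a script needs only the profile name, the output shown above can be filtered. The following sketch assumes the "Current active profile:" prefix shown in the example output; the function name active_profile is hypothetical.

```shell
# active_profile
# Reads `tuned-adm active` output on standard input and prints only the
# profile name. Prints nothing if the expected prefix is not present.
active_profile() {
    sed -n 's/^Current active profile: //p'
}
```

A typical invocation would be tuned-adm active | active_profile.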

1.5.2 Activating a Tuned Profile


The following procedure describes how to activate a Tuned profile by using the command line. To activate
a Tuned profile by using the Cockpit web console, see Oracle Linux: Use Cockpit to Set Up Performance
Profiles.

Note

To activate a Tuned profile, the tuned service must be running on your system.

Use the following command to activate a specific Tuned profile:


# tuned-adm profile profile-name

To have Tuned recommend the profile that is most suitable for your system, use the tuned-adm
recommend command:
# tuned-adm recommend
virtual-guest

To activate a combination of multiple profiles, use the following command syntax:


# tuned-adm profile profile1 profile2

1.5.3 Disabling Tuned


To disable tuning temporarily, use the following command:
# tuned-adm off

Running the previous command disables any tuning settings until you restart the tuned service. When you
restart the service, all of the previous tuning settings are re-applied.

You can disable tuning on a more permanent basis by stopping and disabling the tuned service as
follows:
# systemctl disable --now tuned

Chapter 2 Monitoring the System


This chapter describes several features, utilities, and methods that you can use to monitor your Oracle
Linux systems to ensure and promote optimal performance.

2.1 Working With the sosreport Utility


The sosreport utility collects information about a system such as hardware configuration, software
configuration, and operational state. You can also use sosreport to enable diagnostics and analytical
functions. To assist in troubleshooting a problem, sosreport records the information in a compressed file
that you can send to a support representative.

If the sos package is not already installed on your system, install it by using the dnf install command.

To list the available plugins and plug-in options, use the following command:
# sosreport -l
The following plugins are currently enabled:

acpid acpid related information


anaconda Anaconda / Installation information
.
.
.
The following plugins are currently disabled:

amd Amd automounter information


cluster cluster suite and GFS related information
.
.
.
The following plugin options are available:
apache.log off gathers all apache logs
auditd.syslogsize 15 max size (MiB) to collect per syslog file
.
.
.

See the sosreport(1) manual page for information about how to enable or disable plugins, and how to
set values for plug-in options.

To use the sosreport utility:

1. Run the sosreport command with desired options to tailor the report to provide information about a
specified problem area:

# sosreport [options ...]

For example, to record only information about Apache and Tomcat and to gather all of the Apache logs,
you would use the following command:

# sosreport -o apache,tomcat -k apache.log=on

sosreport (version 3.6)


.
.
.
Press ENTER to continue, or CTRL-C to quit.

The following example shows how you would enable all boolean options for all loaded plugins
except the rpm.rpmva option, which verifies all packages and can take a considerable amount of
time to run:

# sosreport -a -k rpm.rpmva=off

2. When prompted, press Enter and then provide any additional information that is required:

Please enter your first initial and last name [email_address]: AName
Please enter the case number that you are generating this report for: case#

Running plugins. Please wait ...

Completed [55/55] ...


Creating compressed archive...

Your sosreport has been generated and saved in:


/tmp/sosreport-AName.case#-datestamp-ID.tar.xz

The md5sum is: checksum

Please send this file to your support representative.

The sosreport command saves the report as an xz-compressed tar file in the /tmp directory.

2.2 Working With System Performance Diagnostics Utilities


Performance issues can be caused by a number of system components, including software or hardware,
as well as any related interactions. Many performance diagnostics utilities are available in Oracle Linux.
These include tools that monitor and analyze the resource usage of different hardware components, as well
as tracing tools for diagnosing performance issues in multiple processes and related threads.

Many performance issues are the result of configuration errors. You can avoid these errors by using a
validated configuration that has been pre-tested for the supported software, hardware, storage, drivers,
and networking components. A validated configuration incorporates best practices for an Oracle Linux
deployment and has undergone real-world testing of the complete stack. Oracle publishes many validated
configurations, which are freely available for download. Refer to the release notes for the release that you
are running for additional recommendations on kernel parameter settings.

2.2.1 About System Performance Utilities


The following utilities enable you to collect information about system resource usage and errors, and help
you to identify performance problems that are caused by overloaded disks, network, memory, or CPUs:

dmesg        Displays the contents of the kernel ring buffer, which can contain errors
             about system resource usage. Provided by the util-linux package.

dstat        Displays statistics about system resource usage. Provided by the dstat
             package.

free         Displays the amount of free and used memory in the system. Provided
             by the procps package, which is installed by default in Oracle Linux 8.

iostat       Reports I/O statistics. Provided by the sysstat package.

iotop        Monitors disk and swap I/O on a per-process basis. Provided by the
             iotop package.

ip           Reports network interface statistics and errors. Provided by the
             iproute package, which is installed by default in Oracle Linux 8.

mpstat       Reports processor-related statistics. Provided by the sysstat package.

nfsiostat    Reports I/O statistics for NFS mounts. Provided by the nfs-utils
             package.

sar          Reports information about system activity. Provided by the sysstat
             package.

ss           Reports socket statistics. Provided by the iproute package.

top          Provides a dynamic real-time view of the tasks that are running on a
             system. Provided by the procps package.

uptime       Displays the system load averages for the past 1, 5, and 15 minutes.
             Provided by the procps package.

vmstat       Reports virtual memory statistics. Provided by the procps package,
             which is installed by default in Oracle Linux 8.

Many of these utilities provide overlapping functionality. For more information, see the individual manual
page for the utility.

2.2.2 Monitoring System Resource Usage


To maintain a continuous record of a system's performance, you should regularly collect and monitor
data about system resource usage. You can establish a baseline of acceptable measurements under
typical operating conditions. You can then use the baseline as a point of reference to make it easier to
identify memory shortages, spikes in resource usage, and other problems when they occur. Monitoring
system performance also enables you to plan for future growth and to determine how configuration
changes might affect future performance.

To run a monitoring command at regular intervals and watch the output change in real time, use
the watch command. In the following example, the mpstat command is run every interval
seconds:

# watch -n interval mpstat

Alternatively, running the following command enables you to specify the sampling interval in seconds:
# mpstat interval

If installed, the sar command records statistics every 10 minutes while the system is running and retains
the information for every day of the current month.

For example, the following command displays all of the statistics that sar recorded for the day (DD) of the
current month that is specified:
# sar -A -f /var/log/sa/saDD
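
The DD placeholder is the two-digit day of the month. As a small sketch, the following helper builds the file name for a given day, defaulting to today; the function name sa_file is hypothetical, and the /var/log/sa path is the one shown in the command above.

```shell
# sa_file [DAY]
# Prints the sar data file path for DAY of the current month (default: today).
sa_file() {
    day=${1:-$(date +%d)}
    day=${day#0}    # drop a leading zero so printf does not parse "08" as octal
    printf '/var/log/sa/sa%02d\n' "$day"
}
```

For example, sar -A -f "$(sa_file 12)" would display the statistics recorded on the 12th, provided that file exists.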

To run the sar command as a background process and collect data in a file to be displayed later by using
the -f option, you would use the following command:
# sar -o datafile interval count >/dev/null 2>&1 &

where count is the number of samples to record.

Oracle OSWatcher Black Box (OSWbb) and OSWbb analyzer (OSWbba) are useful tools for collecting and
analysing performance statistics. For more information, see Section 2.3, “Working With OSWatcher Black
Box”.

2.2.3 Monitoring CPU Usage


The uptime, mpstat, sar, dstat, and top utilities enable you to monitor CPU usage. When all of a
system's CPU cores are occupied with executing the code of processes, other processes must wait until a
CPU core becomes free or until the scheduler switches a CPU core to run their code. If too many processes
are queued too often, a bottleneck in the performance of the system can occur.

The mpstat -P ALL and sar -u -P ALL commands display CPU usage statistics for each CPU core
and averaged across all of the CPU cores.

The %idle value shows the percentage of time that a CPU was not running system or process code. If the
value of %idle is near 0% most of the time on all CPU cores, the system is CPU-bound for the workload
that it is running. The percentage of time spent running system code (%system or %sys) should not usually
exceed 30%, especially if %idle is close to 0%.

The system load average represents the number of processes that are running on CPU cores, waiting to
run, or waiting for disk I/O activity to complete, averaged over a period of time. On a busy system, the load
average reported by uptime or sar -q should usually not be greater than two times the number of CPU
cores over periods as long as 5 or 15 minutes. If the load average exceeds four times the number of CPU
cores for long periods, the system is overloaded.
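
The two thresholds described above can be expressed as a small check. The following is a minimal sketch, not a definitive tool: the classify_load function name and the labels ok, busy, and overloaded are assumptions made for this example.

```shell
# classify_load LOAD CORES
# Prints "ok" when LOAD is at most 2 x CORES, "busy" when it lies between
# 2 x and 4 x CORES, and "overloaded" when it exceeds 4 x CORES.
classify_load() {
    awk -v load="$1" -v cores="$2" 'BEGIN {
        if (load > 4 * cores)      print "overloaded"
        else if (load > 2 * cores) print "busy"
        else                       print "ok"
    }'
}

# On a live system, /proc/loadavg holds the 1-, 5-, and 15-minute load
# averages in its first three fields, and nproc reports the available cores.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r one five fifteen rest < /proc/loadavg
    echo "15-minute load $fifteen on $(nproc) cores: $(classify_load "$fifteen" "$(nproc)")"
fi
```

The 15-minute average is used here because the guidance applies to sustained load rather than to short spikes.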

In addition to load averages (ldavg-*), the sar -q command reports the number of processes currently
waiting to run (the run-queue size, runq-sz) and the total number of processes (plist-sz). The
value of runq-sz also provides an indication of CPU saturation.

Determine the system's average load under normal loads where users and applications do not experience
problems with system responsiveness, and then look for deviations from this benchmark over time. A
dramatic rise in the load average can indicate a serious performance problem.

A combination of sustained large load average or large run queue size and low %idle can indicate that the
system has insufficient CPU capacity for the workload. When CPU usage is high, use a command such as
dstat or top to determine which processes are most likely to be responsible. For example, the following
dstat command shows which processes are using CPUs, memory, and block I/O most intensively:

# dstat --top-cpu --top-mem --top-bio

The top command provides a real-time display of CPU activity. By default, top lists the most CPU-
intensive processes on the system. In its upper section, top displays general information including the load
averages over the past 1, 5 and 15 minutes, the number of running and sleeping processes (tasks), and
total CPU and memory usage. In its lower section, top displays a list of processes, including the process
ID number (PID), the process owner, CPU usage, memory usage, running time, and the command name.
By default, the list is sorted by CPU usage, with the top consumer of CPU listed first. Type f to open the
Fields Management screen, where you can select which fields top displays, change the order of the fields,
and set the sort field. You can also type M in the main display to sort the list on the percentage memory
usage field (%MEM).

2.2.4 Monitoring Memory Usage


The sar -r command reports memory utilization statistics, including %memused, which is the percentage
of physical memory in use.

sar -B reports memory paging statistics, including pgscank/s, which is the number of memory pages
scanned by the kswapd daemon per second, and pgscand/s, which is the number of memory pages
scanned directly per second.

sar -W reports swapping statistics, including pswpin/s and pswpout/s, which are the numbers of
pages swapped in and out per second.

If %memused is near 100% and the scan rate is continuously over 200 pages per second, the system has a
memory shortage.

Once a system runs out of real or physical memory and starts using swap space, its performance
deteriorates dramatically. If you run out of swap space, your programs or the entire operating system are
likely to crash. If free or top indicate that little swap space remains available, this is also an indication
you are running low on memory.
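
As a rough illustration of checking how much swap space remains, the following sketch reads text in the format of /proc/meminfo and prints the percentage of swap that is still free. The swap_free_pct function name is hypothetical; the SwapTotal and SwapFree fields are the ones that the kernel reports in /proc/meminfo.

```shell
# swap_free_pct: reads /proc/meminfo-format text on standard input and prints
# the percentage of swap space that is free, or "no swap" if none is configured.
swap_free_pct() {
    awk '$1 == "SwapTotal:" { total = $2 }
         $1 == "SwapFree:"  { free = $2 }
         END {
             if (total > 0) printf "%.0f\n", 100 * free / total
             else print "no swap"
         }'
}
```

On a running system, you would typically call it as swap_free_pct < /proc/meminfo and compare the result against a limit that suits your workload.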

The output from the dmesg command might include notification of any problems with physical memory that
were detected at boot time.

2.2.5 Monitoring Block I/O Usage


The iostat command monitors the loading of block I/O devices by observing the time that the devices
are active relative to the average data transfer rates. You can use this information to adjust the system
configuration to balance the I/O loading across disks and host adapters.

iostat -x 1 reports extended statistics about block I/O activity at one second intervals, including %util,
which is the percentage of elapsed time during which I/O requests were issued to a device, and avgqu-sz
(named aqu-sz in recent versions of the sysstat package), which is the average queue length of I/O requests
that were issued to that device. If %util approaches 100% or avgqu-sz is greater than 1, device saturation
is occurring.
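
The saturation test can be sketched as a filter over iostat -x output. This is an illustrative example only: the saturated_devices function name is hypothetical, and the 90% limit stands in for "approaches 100%". The columns are located by header name because their positions differ between sysstat versions.

```shell
# saturated_devices: reads "iostat -x" output on standard input and prints the
# name of each device whose %util is at least 90 or whose average queue
# length is greater than 1.
saturated_devices() {
    awk '
    $1 ~ /^Device/ {
        # Find the columns by name in the header line of each report.
        util = 0; queue = 0
        for (i = 1; i <= NF; i++) {
            if ($i == "%util") util = i
            if ($i == "aqu-sz" || $i == "avgqu-sz") queue = i
        }
        next
    }
    util && NF >= util && ($util + 0 >= 90 || (queue && $queue + 0 > 1)) { print $1 }
    '
}
```

A typical invocation would be iostat -x 1 2 | saturated_devices. Note that the first report from iostat covers the time since boot, so later reports are usually more informative.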

You can also use the sar -d command to report on block I/O activity, including values for %util and
avgqu-sz.

The iotop utility can help you identify which processes are responsible for excessive disk I/O. iotop has
a similar user interface to top. In its upper section, iotop displays the total disk input and output usage in
bytes per second. In its lower section, iotop displays I/O information for each process, including disk input
and output usage in bytes per second, the percentage of time spent swapping in pages from disk or waiting
on I/O, and the command name. Use the left and right arrow keys to change the sort field, and press A to
toggle the I/O units between bytes per second and total number of bytes, or O to toggle between displaying
all processes or only those processes that are performing I/O.

2.2.6 Monitoring File System Usage


The sar -v command reports the number of unused cache entries in the directory cache (dentunusd)
and the numbers of in-use file handles (file-nr), inode handlers (inode-nr), and pseudo terminals
(pty-nr).

nfsiostat reports I/O statistics for each NFS file system that is mounted. If this command is not available,
install the nfs-utils package.

2.2.7 Monitoring Network Usage


The ip -s link command displays network statistics and errors for all network devices, including
the numbers of bytes transmitted (TX) and received (RX). The dropped and overrun fields provide an
indicator of network interface saturation, for example:
# ip -s link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
RX: bytes packets errors dropped overrun mcast
240 4 0 0 0 0
TX: bytes packets errors dropped carrier collsns
240 4 0 0 0 0
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen
link/ether 08:00:27:60:95:d5 brd ff:ff:ff:ff:ff:ff
RX: bytes packets errors dropped overrun mcast
258187485 671730 0 0 0 17296
TX: bytes packets errors dropped carrier collsns
13227598 130827 0 0 0 0
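
To pick out interfaces with non-zero receive drops or overruns from output in the layout shown above, a filter along the following lines could be used. This is a sketch under the assumption that the RX counters appear in the order bytes, packets, errors, dropped, overrun, mcast; other iproute versions might label or order the columns differently. The rx_drops function name is hypothetical.

```shell
# rx_drops: reads "ip -s link" output on standard input and prints
# "interface dropped overrun" for each interface whose RX dropped or
# overrun counter is greater than zero.
rx_drops() {
    awk '
    /^[0-9]+: / { name = $2; sub(/:$/, "", name); next }  # interface header line
    $1 == "RX:" { want = 1; next }                        # counters follow on the next line
    want { if ($4 > 0 || $5 > 0) print name, $4, $5; want = 0 }
    '
}
```

For example, ip -s link | rx_drops prints nothing on a system where no receive drops or overruns have been recorded.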

The ss -s command displays summary statistics for each protocol, for example:
# ss -s
Total: 193
TCP: 9 (estab 2, closed 0, orphaned 0, timewait 0)

2.2.8 Working With the Graphical System Monitor


The GNOME desktop environment includes a graphical system monitor that allows you to display
information about the system configuration, running processes, resource usage, and file systems.

To display the System Monitor, use the following command:


# gnome-system-monitor

Selecting the Resources tab displays the following information:

• CPU usage history in graphical form and the current CPU usage as a percentage.

• Memory and swap usage history in graphical form and the current memory and swap usage.

• Network usage history in graphical form, the current network usage for reception and transmission, and
the total amount of data received and transmitted.

To display the System Monitor Manual, press F1 or select Help, then select Contents.

2.3 Working With OSWatcher Black Box


Oracle OSWatcher Black Box (OSWbb) collects and archives operating system and network metrics that
you can use to diagnose performance issues. OSWbb operates as a set of background processes on the
server and gathers data on a regular basis, invoking such Unix utilities as vmstat, mpstat, netstat,
iostat, and top.

OSWbb is particularly useful for Oracle RAC (Real Application Clusters) and Oracle Grid Infrastructure
configurations. The RAC-DDT (Diagnostic Data Tool) script file includes OSWbb, but does not install it by
default.

2.3.1 Installing OSWbb


To install OSWbb:

1. Log on to My Oracle Support at https://support.oracle.com.

2. Download OSWatcher from the link that is listed by Doc ID 301137.1 at
https://support.oracle.com/epmos/faces/DocumentDisplay?id=301137.1.

3. Copy the file to the directory where you want to install OSWbb, then run the following command:
# tar xvf oswbbVERS.tar

where VERS represents the version number of OSWatcher, for example 832 for OSWatcher 8.32.

Extracting the tar file creates a directory named oswbb, which contains all the directories and files that
are associated with OSWbb, including the startOSWbb.sh script.

4. To enable the collection of iostat information for NFS volumes, edit the OSWatcher.sh script in the
oswbb directory, and set the value of nfs_collect to 1 as follows:
nfs_collect=1

2.3.2 Running OSWbb


To start OSWbb, run the startOSWbb.sh script from the oswbb directory.
# ./startOSWbb.sh [frequency duration]

The optional frequency and duration arguments specify how often in seconds OSWbb should collect
data and the number of hours for which OSWbb should run. The default values are 30 seconds and 48
hours. The following example starts OSWbb recording data at intervals of 60 seconds, and has it record
data for 12 hours:
# ./startOSWbb.sh 60 12
...
Testing for discovery of OS Utilities...
VMSTAT found on your system.
IOSTAT found on your system.
MPSTAT found on your system.
IFCONFIG found on your system.
NETSTAT found on your system.
TOP found on your system.

Testing for discovery of OS CPU COUNT


oswbb is looking for the CPU COUNT on your system
CPU COUNT will be used by oswbba to automatically look for cpu problems

CPU COUNT found on your system.


CPU COUNT = 4

Discovery completed.

Starting OSWatcher Black Box v7.3.0 on date and time

With SnapshotInterval = 60
With ArchiveInterval = 12
...
Data is stored in directory: OSWbba_archive

Starting Data Collection...

oswbb heartbeat: date and time


oswbb heartbeat: date and time + 60 seconds
...

where OSWbba_archive is the path of the archive directory that contains the OSWbb log files.

To stop OSWbb prematurely, run the stopOSWbb.sh script from the oswbb directory:
# ./stopOSWbb.sh

OSWbb collects data in the directories that are under the oswbb/archive directory, which are described
in the following table.

Directory     Description
oswifconfig   Contains output from ifconfig.
oswiostat     Contains output from iostat.
oswmeminfo    Contains a listing of the contents of /proc/meminfo.
oswmpstat     Contains output from mpstat.
oswnetstat    Contains output from netstat.
oswprvtnet    If you have enabled private network tracing for RAC, contains information about the
              status of the private networks.
oswps         Contains output from ps.
oswslabinfo   Contains a listing of the contents of /proc/slabinfo.
oswtop        Contains output from top.
oswvmstat     Contains output from vmstat.

OSWbb stores data in hourly archive files, which are named
system_name_utility_name_timestamp.dat. Each entry in a file is preceded by a timestamp.

2.3.3 Analysing OSWbb Archived Files


You can use the OSWbb analyzer (OSWbba) to provide information about system slowdowns, system
hangs, and other performance problems. You can also use OSWbba to graph data that is collected from
the iostat, netstat, and vmstat utilities. OSWbba requires that you have Java version 1.4.2 or a later
version installed on your system.

You can download a Java RPM for Linux by visiting http://www.java.com, or you can install Java by using
the dnf command:
# dnf install java-1.8.0-openjdk

Run OSWbba from the oswbb directory as follows:


# java -jar oswbba.jar -i OSWbba_archive

where OSWbba_archive is the path of the archive directory that contains the OSWbb log files.

You can use OSWbba to display the following types of performance graph:

• Process run, wait and block queues.

• CPU time spent running in system, user, and idle mode.

• Context switches and interrupts.

• Free memory and available swap.

• Reads per second, writes per second, service time for I/O requests, and percentage utilization of
bandwidth for a specified block device.

You can also use OSWbba to save the analysis to a report file, which reports instances of system
slowdown, spikes in run queue length, or memory shortage, describes probable causes, and offers
suggestions of how to improve performance.
# java -jar oswbba.jar -i OSWbba_archive -A

For more information about OSWbb and OSWbba, refer to the OSWatcher Black Box User Guide (Article
ID 301137.1) and the OSWatcher Black Box Analyzer User Guide (Article ID 461053.1) on My Oracle
Support at https://support.oracle.com.

Chapter 3 Automating System Tasks

Table of Contents
3.1 About Auditing the System
3.2 Working With System Log files
3.2.1 About Logging Configuration (/etc/rsyslog.conf)
3.2.2 Configuring Logwatch
3.3 Using Process Accounting

This chapter describes system auditing, auditing configuration, auditing processes and reporting. This
chapter also describes how to use system log files to track information about your system, and how to use
process accounting to monitor system processes.

3.1 About Auditing the System


Auditing collects data at the kernel level that you can then analyze to identify unauthorized activity.
Auditing collects data in greater detail than system logging does; however, most audited events are
uninteresting and insignificant. Because the process of examining audit trails to locate events of interest
can be significantly challenging, you might consider automating this process.

The audit configuration file, /etc/audit/auditd.conf, defines the following:

• Data retention policy

• Maximum size of the audit volume

• Action to take if the capacity of the audit volume is exceeded

• Locations of local and remote audit trail volumes

The default audit trail volume is /var/log/audit/audit.log. See the auditd.conf(5) manual page
for more information.

By default, auditing captures specific events such as system logins, modifications to accounts, and sudo
actions. You can also configure auditing to capture detailed system call activity or modifications to certain
files. The kernel audit daemon (auditd) records the events that you configure, including the event type, a
time stamp, the associated user ID, and success or failure of the system call.

The entries in the audit rules file, /etc/audit/audit.rules, determine which events are audited. Each
rule is a command-line option that is passed to the auditctl command. You should typically configure
this file to match your site's security policy.

The following are examples of rules that you might set in the /etc/audit/audit.rules file:

To record all unsuccessful exits from open and truncate system calls for files in the /etc directory
hierarchy:
-a exit,always -S open -S truncate -F dir=/etc -F success=0

To record all files opened by a user with UID 10:


-a exit,always -S open -F uid=10

To record all files that have been written to or files with their attributes changed by any user who originally
logged in with a UID of 500 or greater:
-a exit,always -S open -F auid>=500 -F perm=wa

To record requests for write or file attribute change access to the /etc/sudoers file and tag such a
record with the string sudoers-change:
-w /etc/sudoers -p wa -k sudoers-change

To record requests for write and file attribute change access to the /etc directory hierarchy:
-w /etc/ -p wa

To lock the audit configuration so that it cannot be changed until the system is rebooted:
-e 2

If specified, this rule should appear at the end of the /etc/audit/audit.rules file.
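
The following sketch shows one way to assemble several of the rules above into a candidate file and to confirm that the -e 2 rule comes last. The scratch file name is hypothetical and is used only so that the rules can be reviewed before you copy them into place; auditd does not read the file from this location.

```shell
# Stage a candidate rules file in a scratch location for review.
# The file name is hypothetical and is not read by auditd.
rules_file="${TMPDIR:-/tmp}/example-audit.rules"

cat > "$rules_file" <<'EOF'
# Tag requests for write or attribute change access to /etc/sudoers.
-w /etc/sudoers -p wa -k sudoers-change
# Watch the /etc directory hierarchy for writes and attribute changes.
-w /etc/ -p wa
# Lock the configuration until the next reboot. Keep this rule last.
-e 2
EOF

# Confirm that the locking rule is the final line of the file.
tail -n 1 "$rules_file"
```

Placing -e 2 last matters because rules that follow it cannot be loaded once the configuration is locked.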

More examples of audit rules can be found in the /usr/share/doc/audit-version/stig.rules file.


See also the auditctl(8) and audit.rules(7) manual pages.

Stringent auditing requirements can impose a significant performance overhead and generate large
amounts of audit data. Some site security policies stipulate that a system must shut down if events cannot
be recorded because the audit volumes have exceeded their capacity. As a general rule, you should direct
audit data to separate file systems in rotation to prevent overspill and to facilitate backups.

You can use the -k option to tag audit records so that you can locate them more easily in an audit volume
with the ausearch command. For example, to examine records tagged with the string sudoers-change,
you would enter:
# ausearch -k sudoers-change

The aureport command generates summaries of audit data. You can set up cron jobs that run
aureport periodically to generate reports of interest. For example, the following command generates a
report that shows every login event from 1 second after midnight on the previous day until the current
time:
# aureport -l -i -ts yesterday -te now

See the ausearch(8) and aureport(8) manual pages for more information.

3.2 Working With System Log files


The log files contain messages about the system, kernel, services, and applications. The journald
logging daemon, which is part of systemd, records system messages in non-persistent journal files in
memory and in the /run/log/journal directory. journald forwards messages to the system logging
daemon, rsyslog. As files in /run are volatile, the log data is lost after a reboot unless you create the
directory /var/log/journal. You can use the journalctl command to query the journal logs.

For more information, see the journalctl(1) and systemd-journald.service(8) manual pages.

3.2.1 About Logging Configuration (/etc/rsyslog.conf)


The configuration file for rsyslogd is /etc/rsyslog.conf, which contains global directives, module
directives, and rules. By default, rsyslog processes and archives only syslog messages. If required,
you can configure rsyslog to archive any other messages that journald forwards, including kernel,
boot, initrd, stdout, and stderr messages.

Global directives specify configuration options that apply to the rsyslogd daemon. All configuration
directives must start with a dollar sign ($) and only one directive can be specified on each line. The
following example specifies the maximum size of the rsyslog message queue:
$MainMsgQueueSize 50000

The available configuration directives are described in the file
/usr/share/doc/rsyslog-version-number/rsyslog_conf_global.html.

The design of rsyslog allows its functionality to be dynamically loaded from modules, which provide
configuration directives. To load a module, specify the following directive:
$ModLoad MODULE_name

Modules have the following main categories:

• Input modules gather messages from various sources. Input module names always start with the im
prefix (examples include imfile and imrelp).

• Filter modules allow rsyslogd to filter messages according to specified rules. The name of a filter
module always starts with the fm prefix.

• Library modules provide functionality for other loadable modules. rsyslogd loads library modules
automatically when required. You cannot configure the loading of library modules.

• Output modules provide the facility to store messages in a database or on other servers in a network, or
to encrypt them. Output module names always start with the om prefix (examples include omsnmp and
omrelp).

• Message modification modules change the content of an rsyslog message.

• Parser modules allow rsyslogd to parse the message content of messages that it receives. The name
of a parser module always starts with the pm prefix.

• String generator modules generate strings based on the content of messages in cooperation with
rsyslog's template feature. The name of a string generator module always starts with the sm prefix.

Input modules receive messages and pass them to one or more parser modules. A parser module
creates a representation of a message in memory, possibly modifying the message, and passes the
internal representation to output modules, which can also modify the content before outputting the
message.

A description of the available modules can be found at
http://www.rsyslog.com/doc/rsyslog_conf_modules.html.

An rsyslog rule consists of a filter part, which selects a subset of messages, and an action part,
which specifies what to do with the selected messages. To define a rule in the /etc/rsyslog.conf
configuration file, specify a filter and an action on a single line, separated by one or more tabs or spaces.

You can configure rsyslog to filter messages according to various properties. The following are the most
commonly used filters:

• Expression-based filters, written in the rsyslog scripting language, select messages according to
arithmetic, boolean, or string values.

• Facility/priority-based filters filter messages based on facility and priority values that take the form
facility.priority.

• Property-based filters filter messages by properties such as timegenerated or syslogtag.

The following table describes the available facility keywords for facility/priority-based filters.

Facility Keyword   Description
auth, authpriv     Security, authentication, or authorization messages.
cron               crond messages.
daemon             Messages from system daemons other than crond and rsyslogd.
kern               Kernel messages.
lpr                Line printer subsystem.
mail               Mail system.
news               Network news subsystem.
syslog             Messages generated internally by rsyslogd.
user               User-level messages.
uucp               UUCP subsystem.
local0 - local7    Local use.

The following table describes the available priority keywords for facility/priority-based filters, in
ascending order of importance.

Priority Keyword   Description
debug              Debug-level messages.
info               Informational messages.
notice             Normal but significant condition.
warning            Warning conditions.
err                Error conditions.
crit               Critical conditions.
alert              Immediate action required.
emerg              System is unusable.

All messages of the specified priority and higher are logged according to the specified action. An asterisk
(*) wildcard specifies all facilities or priorities. Separate the names of multiple facilities and priorities on a
line with commas (,). Separate multiple filters on one line with semicolons (;). Precede a priority with an
exclamation mark (!) to select all messages except those with that priority.
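
The "specified priority and higher" behavior can be illustrated with a short sketch. The priority_rank and selects functions are hypothetical helpers written for this example, and the rank numbers simply follow the ascending order of the table above; they are not the numeric severity codes of the syslog protocol.

```shell
# priority_rank KEYWORD: prints the position of a priority keyword in
# ascending order of importance (debug = 0 ... emerg = 7).
priority_rank() {
    case "$1" in
        debug) echo 0 ;; info) echo 1 ;; notice) echo 2 ;; warning) echo 3 ;;
        err) echo 4 ;; crit) echo 5 ;; alert) echo 6 ;; emerg) echo 7 ;;
        *) return 1 ;;
    esac
}

# selects FILTER_PRIORITY MESSAGE_PRIORITY: succeeds if a filter such as
# mail.crit would select a message that has the given priority.
selects() {
    want=$(priority_rank "$1") || return 2
    have=$(priority_rank "$2") || return 2
    [ "$have" -ge "$want" ]
}
```

For example, selects crit alert succeeds because alert is more important than crit, whereas selects crit err fails.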

The following are examples of facility/priority-based filters.

Select all kernel messages with any priority as follows:


kern.*

Select all mail messages with crit or higher priority as follows:


mail.crit

Select all daemon and kern messages with warning or err priority as follows:
daemon,kern.warning,err

Select all cron messages except those with info or debug priority as follows:
cron.!info,!debug

By default, /etc/rsyslog.conf includes the following rules:


# Log all kernel messages to the console.
# Logging much else clutters up the screen.
#kern.* /dev/console

# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none /var/log/messages

# The authpriv file has restricted access.
authpriv.* /var/log/secure

# Log all the mail messages in one place.
mail.* -/var/log/maillog

# Log cron stuff
cron.* /var/log/cron

# Everybody gets emergency messages
*.emerg *

# Save news errors of level crit and higher in a special file.
uucp,news.crit /var/log/spooler

# Save boot messages also to boot.log
local7.* /var/log/boot.log

You can send the logs to a central log server over TCP by adding the following entry to the forwarding
rules section of /etc/rsyslog.conf on each log client:
*.* @@logsvr:port

where logsvr is the domain name or IP address of the log server and port is the port number (usually,
514).

On the log server, add the following entry to the MODULES section of /etc/rsyslog.conf:
$ModLoad imtcp
$InputTCPServerRun port

where port corresponds to the port number that you set on the log clients.

To manage the rotation and archival of the correct logs, edit /etc/logrotate.d/syslog so that it
references each of the log files that are defined in the RULES section of /etc/rsyslog.conf. You can
configure how often the logs are rotated and how many past copies of the logs are archived by editing
/etc/logrotate.conf.

It is recommended that you configure Logwatch on your log server to monitor the logs for suspicious
messages, and disable Logwatch on log clients. However, if you do use Logwatch, disable high precision
timestamps by adding the following entry to the GLOBAL DIRECTIVES section of /etc/rsyslog.conf
on each system:
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat

For more information, see the logrotate(8), logwatch(8), rsyslogd(8) and rsyslog.conf(5)
manual pages. See also the HTML documentation in the /usr/share/doc/rsyslog-5.8.10 directory
and the documentation at http://www.rsyslog.com/doc/manual.html.

3.2.2 Configuring Logwatch


Logwatch is a monitoring system that you can configure to report on areas of interest in the system logs.
After you install the logwatch package, the /etc/cron.daily/0logwatch script runs every night
and sends an email report to root. You can set local configuration options in
/etc/logwatch/conf/logwatch.conf that override the main configuration file
/usr/share/logwatch/default.conf/logwatch.conf, including the following:

• Log files to monitor, including log files that are stored for other hosts.

• Names of services to monitor, or to be excluded from monitoring.

• Level of detail to report.

• User to be sent an emailed report.

You can also run logwatch directly from the command line.

For more information, see the logwatch(8) manual page.

3.3 Using Process Accounting


The psacct package implements the process accounting service in addition to the following utilities that
you can use to monitor process activities:

ac Displays connection times in hours for a user as recorded in the wtmp
file (by default, /var/log/wtmp).

accton Turns on process accounting to the specified file. If you do not specify a
file name argument, process accounting is stopped. The default system
accounting file is /var/account/pacct.

lastcomm Displays information about previously executed commands as recorded
in the system accounting file.

sa Summarizes information about previously executed commands as
recorded in the system accounting file.

Note

As with any logging activity, ensure that the file system has enough space to
store the system accounting and wtmp files. Monitor the size of the files and, if
necessary, truncate them.

For more information, see the ac(1), accton(8), lastcomm(1), and sa(8) manual pages.

Chapter 4 Working With Kernel Dumps

Table of Contents
4.1 About Kdump
4.2 Kdump Installation and Configuration
4.2.1 Files That Are Used by Kdump
4.2.2 Installing and Configuring Kdump
4.2.3 Configuring the Kdump Output Location
4.2.4 Configuring the Default Kdump Failure State
4.3 Analyzing Kdump Output
4.4 Using Early Kdump

This chapter provides information about the Kdump feature and describes how to configure a system
to create a memory image in the event of a system crash. The chapter also describes how to use the
crash utility to interactively analyze the state of a running Oracle Linux system or of a system after a
kernel crash has occurred.

4.1 About Kdump


The Kdump feature provides a kernel crash dumping mechanism in Oracle Linux. The kdump service
enables you to save the contents of the system’s memory for later analysis by booting a second kernel,
which resides in a reserved part of the system memory.

Kdump uses the kexec system call to boot into the second kernel, called a capture kernel, without the
need to reboot the system. The capture kernel then captures the contents of the crashed kernel’s memory
as a crash dump (vmcore) and saves it. The vmcore crash dump can help with determining the cause of
the crash.

Oracle recommends that you enable the Kdump feature because a crash dump might be the only
information that is available if a system failure occurs. Ensuring that Kdump is enabled is essential in
many mission-critical environments.

Prior to enabling Kdump, ensure that your system meets all of the memory requirements for using Kdump.
To capture a kernel crash dump and save it for further analysis, you must permanently reserve part of
the system's memory for the capture kernel. Note that this part of the system's memory will no longer be
available to the main kernel. The following table lists the minimum amount of reserved memory that is
required to use Kdump, based on the system's architecture and the amount of available memory.

Architecture     Available Memory    Minimum Reserved Memory
x86_64           1 GB to 64 GB       160 MB of RAM
                 64 GB to 1 TB       256 MB of RAM
                 1 TB and more       512 MB of RAM
Arm (aarch64)    2 TB and more       512 MB of RAM

4.2 Kdump Installation and Configuration


This section describes how to install and configure Kdump by using the command line.

For information about configuring Kdump by using the Cockpit web console, see Oracle Linux: Use Cockpit
to Configure Kdump.

4.2.1 Files That Are Used by Kdump


When you install and configure Kdump, the following files are modified:

/boot/grub2/grub.cfg
    Appends the crashkernel option to the kernel line to specify the amount of reserved memory
    and any offset value.

/etc/kdump.conf
    Sets the location where the dump file can be written, the filtering level for the
    makedumpfile command, and the default behavior to take if the dump fails. See the comments
    in the file for information about the supported parameters.

When you edit these files, you must reboot the system for the changes to take effect.

For more information, see the kdump.conf(5) manual page.
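
After a reboot, the options from the kernel line appear in /proc/cmdline, so one way to confirm
that the crashkernel option was applied is to look for it there. The following is a minimal
sketch; get_crashkernel is a hypothetical helper name, and the command-line strings are
illustrative.

```shell
#!/bin/bash
# Hypothetical helper: print the value of the last crashkernel= option found
# in a kernel command-line string. Returns non-zero if the option is absent.
get_crashkernel() {
    local -a words
    local word value=""
    read -r -a words <<< "$1"
    for word in "${words[@]}"; do
        case "$word" in
            crashkernel=*) value=${word#crashkernel=} ;;
        esac
    done
    [ -n "$value" ] && echo "$value"
}

# Typical use on a running system:
#   get_crashkernel "$(cat /proc/cmdline)"
```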

4.2.2 Installing and Configuring Kdump


During an Oracle Linux interactive installation with the graphical installer, you have the option to enable
Kdump and specify how much system memory is reserved for Kdump. The installer screen is titled Kdump
and is available from the main Installation Summary screen of the installer.

If you do not enable Kdump at installation time, or it is not enabled by default during an installation, as in
the case of a custom kickstart installation, you can install and enable the feature by using the command
line.

Before you install and configure Kdump by using the command line, ensure that your system meets all of
the necessary memory requirements. For details, see Section 4.1, “About Kdump”.

1. If the kexec-tools package, which provides Kdump, is not already installed on your system, install it:
# dnf install kexec-tools

2. As the root user, edit the /etc/default/grub file and set the crashkernel= option in the
GRUB_CMDLINE_LINUX line to the required value.

For example, you would reserve 64 MB of memory as follows:


crashkernel=64M

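You can also make the amount of reserved memory depend on the total amount of system memory by
using the following syntax:
crashkernel=range1:size1,range2:size2

For example, the following setting reserves 64 MB on systems that have at least 512 MB but less
than 2 GB of memory, and 128 MB on systems that have 2 GB or more:

crashkernel=512M-2G:64M,2G-:128M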

3. (Optional) If necessary, offset the reserved memory.

Because the crashkernel reservation occurs very early, some systems require that you reserve memory
with a certain fixed offset. When a fixed offset is specified, the reserved memory begins at that point.
For example, to reserve 128 MB of memory starting at 16 MB, specify the following:
crashkernel=128M@16M

Note that if no offset parameter is set, Kdump offsets reserved memory automatically.

4. Refresh the grub configuration to apply your changes:

# grub2-mkconfig -o /boot/grub2/grub.cfg

5. Reboot the system and finish configuring Kdump.

For instructions, see Section 4.2.3, “Configuring the Kdump Output Location”.

6. When you have finished configuring Kdump, enable the kdump service:
# systemctl enable --now kdump.service
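
To check which size a range-based crashkernel value from step 2 would select on a given system,
the selection rule can be sketched in a few lines of shell. This is an illustration rather than
the kernel's own parser: the helper names to_mb and resolve_crashkernel are hypothetical, only
the M and G suffixes are handled, and each range is treated as including its start and excluding
its end.

```shell
#!/bin/bash
# Hypothetical helpers, not part of kexec-tools.

# Convert a size such as 512M or 2G to a number of MB.
to_mb() {
    case "$1" in
        *G) echo $(( ${1%G} * 1024 )) ;;
        *M) echo "${1%M}" ;;
        *)  return 1 ;;
    esac
}

# Print the size that a range-based crashkernel value selects for a system
# with the given amount of memory in MB. Returns non-zero if no range matches.
resolve_crashkernel() {
    local spec=$1 mem_mb=$2 entry range size start end
    local IFS=,
    for entry in $spec; do
        range=${entry%%:*}          # for example 512M-2G or 2G-
        size=${entry#*:}            # for example 64M
        start=$(to_mb "${range%%-*}") || return 1
        end=${range#*-}             # empty for an open-ended range
        if [ -n "$end" ]; then
            end=$(to_mb "$end") || return 1
        fi
        if [ "$mem_mb" -ge "$start" ] && { [ -z "$end" ] || [ "$mem_mb" -lt "$end" ]; }; then
            echo "$size"
            return 0
        fi
    done
    return 1
}

# Example: a system with 4 GB (4096 MB) of memory.
resolve_crashkernel '512M-2G:64M,2G-:128M' 4096
```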

4.2.3 Configuring the Kdump Output Location


After installing Kdump, you can define where the resulting output should be saved. In Oracle Linux, Kdump
files are stored in the /var/crash directory by default.

To save the result to other locations, such as NFS mounts, externally mounted drives, and remote file
servers, edit the /etc/kdump.conf file and remove the # comment character at the beginning of each
line that you want to enable.

For example, to add a new directory location, prefix it with the path keyword:
path /usr/local/cores

Use raw to output directly to a specific device in the /dev directory. You can also manually specify the
output file system for a particular device by using its label, name or UUID, for example:
ext4 UUID=5b065be6-9ce0-4154-8bf3-b7c4c7dc7365

Kdump files can also be transferred over a secure shell connection, as shown in the following example:
ssh [email protected]
sshkey /root/.ssh/mykey

It is also possible to export the result to a compatible network share:


nfs example.com:/output

When you have finished configuring the output location for Kdump, enable the kdump service as follows:
# systemctl enable --now kdump.service
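
Because the file consists largely of commented-out examples, it can be useful to list only the
lines that are currently enabled before you start the service. The following is a minimal sketch;
kdump_active_directives is a hypothetical helper name, and the file path is passed as an argument
so that the function can be tried on a copy of the file first.

```shell
#!/bin/bash
# Hypothetical helper: print the active lines of a kdump.conf-style file,
# that is, every line that is neither blank nor a comment.
kdump_active_directives() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Typical use:
#   kdump_active_directives /etc/kdump.conf
```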

4.2.4 Configuring the Default Kdump Failure State


By default, if kdump fails to output its result to the configured output locations, it reboots the server. This
action deletes any data that has been collected for the dump, so you should uncomment and change the
default value in the /etc/kdump.conf file as follows:
default dump_to_rootfs

The dump_to_rootfs option attempts to save the result to a local directory, which can be particularly
useful if a network share is unreachable. Using shell instead enables you to copy the data manually from
the command line.

Note

The poweroff, reboot, and halt options are also valid for the default kdump
failure state; however, you lose the collected data if those actions are performed.
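
The edit described in this section can also be scripted. The following is a minimal sketch under
stated assumptions: set_kdump_default is a hypothetical helper name, the file is assumed to carry
the setting as a line that begins with default or #default, and GNU sed is assumed for the -i
option. Try it on a copy of the file before you apply it to /etc/kdump.conf.

```shell
#!/bin/bash
# Hypothetical helper: set the "default" failure action in a kdump.conf-style
# file. A line that starts with "default" or "#default" is replaced; if no
# such line exists, a new one is appended. Requires GNU sed for -i.
set_kdump_default() {
    local file=$1 action=$2
    # Accept only a single lowercase word such as dump_to_rootfs or shell.
    case "$action" in
        ''|*[!a-z_]*) echo "invalid action: $action" >&2; return 1 ;;
    esac
    if grep -Eq '^#?default[[:space:]]' "$file"; then
        sed -i -E "s/^#?default[[:space:]].*/default $action/" "$file"
    else
        printf 'default %s\n' "$action" >> "$file"
    fi
}

# Typical use on a copy of the configuration file:
#   cp /etc/kdump.conf /tmp/kdump.conf.test
#   set_kdump_default /tmp/kdump.conf.test dump_to_rootfs
```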


4.3 Analyzing Kdump Output


The crash utility provides a shell prompt that enables you to analyze the contents of your kdump core
dumps, which is particularly useful when troubleshooting problems.

1. If the crash package is not installed on your system, install it:


# dnf install crash

2. Identify the currently running kernel, for example:


# uname -r
4.18.0-80.el8.x86_64

3. Provide the location of the kernel debuginfo module and the location of the core dump as parameters to
the crash utility, for example:
# crash /usr/lib/debug/lib/modules/4.18.0-80.el8.x86_64/vmlinux \
/var/crash/127.0.0.1-2019-10-28-12:38:25/vmcore

where 4.18.0-80.el8.x86_64 is the currently running kernel and 127.0.0.1-2019-10-28-12:38:25
represents the ipaddress-timestamp.

4. Inside the crash shell, you can use the help log command to better understand how to use the log
command.

You can also use the bt, ps, vm, and files commands to get more information about the core dump.

5. When you have finished analyzing the core dump, exit the shell.

For more detailed information about using the crash utility, see the crash(8) manual page.
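
The two paths that are passed to crash in step 3 follow a fixed pattern, so the command can be
assembled from the kernel release and the dump directory. The following is a minimal sketch;
crash_command is a hypothetical helper name, and the sketch assumes that the system rebooted into
the same kernel release that produced the dump, because the debuginfo file must match the kernel
that crashed.

```shell
#!/bin/bash
# Hypothetical helper: print the crash command line for a kernel release and
# a dump directory, using the path layout shown in step 3.
crash_command() {
    local release=$1 dumpdir=$2
    printf 'crash /usr/lib/debug/lib/modules/%s/vmlinux %s/vmcore\n' "$release" "$dumpdir"
}

# Typical use for the running kernel:
#   crash_command "$(uname -r)" /var/crash/127.0.0.1-2019-10-28-12:38:25
```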

4.4 Using Early Kdump


New in Oracle Linux 8, early Kdump enables the crash kernel and initramfs to load early enough to capture
vmcore information for early crashes.

Because the kdump service starts late in the boot process, crashes that occur before the service
starts cannot use normal kdump kernel booting. As a result, information about early crashes is lost.
To address this issue, you can enable early Kdump by adding a dracut module so that the crash kernel
and initramfs are loaded as early as possible. Early Kdump is disabled by default. When it is
enabled, the crash kernel and initramfs are loaded in the same way as for a normal kdump.

Note that early Kdump currently does not support Fadump.

For more information about configuring early Kdump on your Oracle Linux 8 systems, see the step-by-step
instructions in the /usr/share/doc/kexec-tools/early-kdump-howto.txt file.
