Linux Magazine - January 2024 USA


FREE DVD

ISSUE 278 – JANUARY 2024

I2C Flight Simulator Interface on a Raspberry Pi

Scientific Computing with a Bitcoin mining rig

Acoustic Keyloggers: Watch out for tools that listen to keystrokes
R Programming: Get started with this powerful scientific language
PyScript: Python in a browser
Waydroid: Run your Android apps on Linux
Bond your NICs: Faster together, but test as you go

WWW.LINUX-MAGAZINE.COM

10 TANTALIZING FOSS FINDS!
EDITORIAL
Welcome

DEUCE COUPS
Dear Reader,
What a busy weekend in tech news. On Friday, we heard that OpenAI, creators of ChatGPT, had fired CEO Sam Altman, and by Monday, he had already found a new job at Microsoft, along with cofounder Greg Brockman. More than 700 OpenAI employees signed a letter saying they would quit – and quite possibly jump to Microsoft – if the OpenAI board didn’t hire Altman back and resign. Microsoft said Altman and Brockman would lead Microsoft’s new advanced AI research team. OpenAI, on the other hand, went into free fall, announcing an interim CEO whose tenure lasted for two days before another CEO was named.

Wall Street was very happy for Microsoft, driving the share price to a record high. Meanwhile, OpenAI was roundly condemned – both for firing Altman and for the way they did it. The word on the street was that Microsoft pulled off a “coup” by snagging Altman, Brockman, and whoever else they could pull over. Altman and others also referred to his ousting by the OpenAI board as a “coup,” with a very different spin on the term. Two coups in four days is a lot – even at the frenetic pace of IT.

From a business viewpoint, Microsoft was simply capitalizing on an opportunity – and acting to protect their investment, because they had acquired a large stake in OpenAI earlier this year and couldn’t afford to watch the company self-destruct. But it is worth pointing out that this really isn’t all from a business viewpoint. OpenAI is actually ruled by a nonprofit board controlling a for-profit subsidiary. The question of what is better for OpenAI’s business interests, which seems to be the fat that everyone is chewing on, might not be the best context for understanding these events.

Altman’s disagreement with the board appears to have been about the pace of development and the safety of the tools the company has developed. OpenAI’s vision is supposed to be to develop AI “for the benefit of humanity,” which is very admirable, but it leaves lots of room for interpretation. Altman, in particular, has occupied an ambiguous space in the press, at once warning about the dangers of AI and simultaneously pledging to press ahead with development. No doubt he felt confident that he was laying down sufficient guardrails along the way, but that is something to communicate with your board about, and it sounds like he wasn’t communicating to their satisfaction. Should the board have trusted him and let him forge ahead, knowing that the company was on a roll and potentially on the verge of further innovations? If they were a garden-variety corporate board, possibly yes, but as a board member of a nonprofit, you are really supposed to have more on your mind than power and money. You’re supposed to know when to say “no,” even if it annoys everyone and stirs up some turmoil.

Of course that is the charitable view of the board’s action. A darker (and equally speculative) view is that nonprofit boards can sometimes be highly dysfunctional, with a lot of their own internal power games and politics, and maybe the intrepid Altman was simply unable to steer around a raging Charybdis of groupthink.

The whole story hung in a state of uncertainty for two days; then lightning struck again: OpenAI hired Altman back. Was this a third coup, or the undoing of a previous coup? Microsoft gave the new plan its full support. OpenAI ditched three of the four board members who voted for Altman’s ouster (including the only two women), and the new board has pledged a full investigation into what happened. We might need to wait for that report to know all the details of the internal struggle that led to this unexpected whiplash festival, but one thing seems clear: Altman and the full-steam-ahead faction are the winners, and the proceed-with-caution faction is out in the cold. Ousted board member Helen Toner, for instance, recently co-authored a paper that warned of a possible “race to the bottom” in the AI industry, “in which multiple players feel pressure to neglect safety and security challenges in order to remain competitive” [1]. Some are now saying that paper helped to stir up the skirmish in the first place.

Why did Microsoft let Altman go back? It isn’t like them to surrender the spoils of victory. Keep in mind that the competition is heating up. Amazon just announced its Olympus AI initiative, and Google, Meta, and several other tech giants are all working on their own AI projects. Microsoft is already committed to building OpenAI’s technology into its own products, and they might have realized that, by the time the exiles settle into their new workspace and get down to training models and producing real software, their head start might already be gone.

OpenAI has regained its footing as a business, but as a nonprofit devoted to serving humanity, it appears to have fallen off its pedestal, or at least, dropped down to a lower pedestal. I fear the biggest loser in all this might be the optimistic OpenAI vision of a nonprofit innovator taking a principled stand for methodical and safe development of these revolutionary tools. Note to governments: Now might be a good time to provide some meaningful restraints for the AI industry – don’t expect them to police themselves.

Info
[1] “Decoding Intentions: Artificial Intelligence and Costly Signals,” by Andrew Imbrie, Owen J. Daniels, and Helen Toner: https://cset.georgetown.edu/wp-content/uploads/CSET-Decoding-Intentions.pdf

Joe Casad, Editor in Chief

LINUX-MAGAZINE.COM ISSUE 278 JANUARY 2024 3


JANUARY 2024

ON THE COVER

26 R for Science
This easy-to-learn language comes with powerful tools for data analysis.

38 Acoustic Keyloggers
Sneaky tools that gather information from the sound of typing.

54 PyScript
Versatile solution for putting Python in a browser.

65 Teaming NICs
Bundle your network adapters to speed up remote access.

69 RPi Flight Simulator Interface
Explore the I2C interface with this high-flying maker project.

90 Waydroid
Access Android apps from your Linux desktop.

NEWS

8 News
• AlmaLinux Will No Longer Be “Just Another RHEL Clone”
• OpenELA Releases Enterprise Linux Source Code
• StripedFly Malware Hiding in Plain Sight as a Cryptocurrency Miner
• Experimental Wayland Support Planned for Linux Mint 21.3
• KDE Plasma 6 Sets Release Date
• Gnome Developers in Discussion to End Support for X.Org

12 Kernel News
• Avoiding Bloat in the Kernel That Does Everything
• Particularly Odd Occurrences of Stardust

COVER STORIES

16 Science on a Crypto Rig
Could a once-impressive Bitcoin mining rig have a second life in scientific computing?

22 Data Science Methods
We tour some important tools for gaining insights from mountains of data.

26 R for Science
The R programming language is a universal tool for data analysis and machine learning.

REVIEWS

32 Distro Walk – Immutable Distros
Immutable distributions offer a layer of added security. Bruce explains how immutable systems work and discusses their benefits and drawbacks.

IN-DEPTH

36 AlmaLinux
Recent policy changes at Red Hat have upturned the RHEL clone community. AlmaLinux charts a new path by shifting to binary compatibility and away from being a downstream RHEL build.

38 Acoustic Keyloggers
Is someone listening in on your typing? Learn more about how acoustic keyloggers work.

46 Command Line – neofetch
Display information about your hardware, operating system, and desktop in visually appealing output.

48 datamash
This data processor for your scripts makes long, complex calculations simple.

54 PyScript
Use your favorite Python libraries on client-side web pages.

60 Programming Snapshot – Go CGI Scripting
Mike Schilli steps on the scale every week and records his weight fluctuations as a time series. To help monitor his progress, he writes a CGI script in Go that stores the data and draws visually appealing charts.

65 Teaming NICs
Combining your network adapters can speed up network performance – but a little more testing could lead to better choices.



Scientific Computing
A crypto mining rig is built for math. Can an old rig find a second life solving science problems? That all depends on the problem. Also this month, we explore a few popular data analysis techniques and stir up some analysis of our own with the R programming language.

MakerSpace

69 RPi Flight Simulator Interface
A Raspberry Pi running Linux with a custom I2C card and a small power supply provides an interface for a real-time flight simulator.

74 BCPL
The BCPL procedural structured programming language is fast, reliable, and efficient, offering a wide range of software libraries and system functions.

79 Welcome
This month in Linux Voice.

80 Doghouse – What is Fun?
This month maddog writes about what makes free software fun for him.

81 Compressing Files with RAR
The non-free RAR compression tool offers some benefits you won’t find with ZIP and TAR.

84 FOSSPicks
This month Graham looks at osci-render, Spacedrive, internetarchive, LibrePCB 1.0.0, and more!

90 Tutorial – Waydroid
Waydroid brings Android apps to the Linux desktop in a simple and effective way.

95 Back Issues
96 Events
97 Call for Papers
98 Coming Next Month

@linuxmagazine
@linuxpromagazine
Linux Magazine
@linux_pro

TWO TERRIFIC DISTROS
DOUBLE-SIDED DVD!
SEE PAGE 6 FOR DETAILS



DVD
This Month’s DVD

Kubuntu 23.10 and Fedora 39

Two Terrific Distros on a Double-Sided DVD!

Kubuntu 23.10 (64-bit)
Kubuntu is the Ubuntu variant that comes with the KDE desktop. The latest release, codenamed Mantic Minotaur, ships with KDE Plasma 5.27. The Kubuntu team says the 5.27 release “brings massive improvements to the desktop and all its tools.” Plasma comes with a new configuration wizard, as well as “a window tiling system, a more stylish app theme, cleaner and more usable tools, and widgets that give you more control over your machine.” Included in the release are major updates to KRunner, the Discover software manager, and many of Plasma’s most popular panels, trays, and widgets, such as the digital clock and color picker. The Ubuntu base underneath Kubuntu comes with Linux kernel 6.5, in addition to GCC 13.2.0 and several other updates to developer tools. Expert users can also choose the experimental ZFS filesystem and TPM-based disk encryption.

Fedora 39 (64-bit)
Fedora 39 marks the 20th year of Fedora releases. As a mature operating system, Fedora 39 has few major changes, but it does offer the first look at many small enhancements to performance and the user experience that will be used in CentOS Stream and Red Hat Enterprise Linux. However, a previously announced web-based installer program has been delayed until Fedora 40. Meanwhile, Fedora 39 offers the usual upgrades in the kernel and standard desktop applications such as LibreOffice and Gnome Boxes. Among the performance enhancements are default hardware-accelerated video decoding, multithreaded thumbnails for images, and improved search performance in Gnome and the file manager. Users may also notice a color-coded Bash prompt, as well as enhancements contained in Gnome 45, such as a more detailed workspace window, a PipeWire-based camera app, a rewritten Image Viewer app, and new desktop widgets. Such changes continue Fedora’s long tradition of a user-friendly experience suitable for all levels of users.

Defective discs will be replaced. Please send an email to [email protected].


Although this Linux Magazine disc has been tested and is to the best of our knowledge free of malicious software and defects, Linux Magazine
cannot be held responsible and is not liable for any disruption, loss, or damage to data and computer systems related to the use of this disc.



NEWS
Updates on technologies, trends, and tools

THIS MONTH’S NEWS
08 • AlmaLinux Will No Longer Be “Just Another RHEL Clone” • elementary OS 8 Has a Big Surprise in Store
09 • OpenELA Releases Enterprise Linux Source Code • StripedFly Malware Hiding in Plain Sight as a Cryptocurrency Miner • More Online
10 • Experimental Wayland Support Planned for Linux Mint 21.3 • Window Maker Live 0.96.0-0 Released • KDE Plasma 6 Sets Release Date
11 • Fedora Project and Slimbook Release the New Fedora Slimbook • Gnome Developers in Discussion to End Support for X.Org

AlmaLinux Will No Longer Be “Just Another RHEL Clone”

As my favorite band, Rush, once said in “Circumstances,” “plus ça change, plus c’est la même chose.” In other words, the more that things change, the more they stay the same.
But this time around, AlmaLinux isn’t happy with staying the same… especially with regards to remaining in lockstep with Red Hat Enterprise Linux (RHEL). With the upcoming release of AlmaLinux 9.3, those who have become fans of the distribution should expect change. This new release will not rely on RHEL source code. Instead, AlmaLinux 9.3 is built from the CentOS Stream repositories, which is upstream from RHEL.
What does this mean for users? AlmaLinux 9.3 will most likely not change all that much. The distribution will continue supporting x86_64, aarch64, ppc64le, and s390x architectures and will likely no longer release days after RHEL.
According to benny Vasquez (https://almalinux.org/blog/future-of-almalinux/ ), AlmaLinux OS Foundation Chair, “For a typical user, this will mean very little change in your use of AlmaLinux. Red Hat-compatible applications will still be able to run on AlmaLinux OS, and your installs of AlmaLinux will continue to receive timely security updates.”
“The most remarkable potential impact of the change is that we will no longer be held to the line of ‘bug-for-bug compatibility’ with Red Hat, and that means that we can now accept bug fixes outside of Red Hat’s release cycle,” Vasquez continues. “While that means some AlmaLinux OS users may encounter bugs that are not in Red Hat, we may also accept patches for bugs that have not yet been accepted upstream or shipped downstream.”
AlmaLinux 9.3 is now available to download (https://almalinux.org/get-almalinux/ ).

elementary OS 8 Has a Big Surprise in Store


Elementary OS has long been a favorite of mine. For years it was my go-to Linux
distribution, which came to a halt when I purchased my first System76 Thelio desk-
top. Even so, I’ve continued to admire from afar the work that goes into the OS.
And with the upcoming release, the development team plans to finally shift to the
Wayland display server by default.
This has been a long time coming, because Wayland is far superior to, and more secure than, X.Org.
Wayland isn’t the only change coming to elementary OS 8. According to the
team’s recent blog (https://blog.elementary.io/lets-talk-os-8/ ), version 8 of the OS
will also include the continued transition to GTK 4.
So far, the Captive Network Assistant, Initial Setup, and Videos app have already
made the transition (in their respective development branches), and the port for the
AppCenter is almost done.
The System Settings app and the indicator area will also see some major changes,
making them both more modern and responsive. In addition, the development team




is considering an immutable version of elementary OS, adding Pipewire, replacing the onscreen keyboard, and even reevaluating the systemd boot.
Of course, not everything will make it into version 8, but it looks like the team has their work cut out for them.
If you’d like to get early access to daily builds, you can do so by becoming an elementary OS sponsor on GitHub (https://github.com/sponsors/elementary).

MORE ONLINE

Linux Magazine
www.linux-magazine.com

ADMIN HPC
http://www.admin-magazine.com/HPC/
Managing Storage with LVM • Jeff Layton
Managing Linux storage servers with the Linux Logical Volume Manager.

ADMIN Online
http://www.admin-magazine.com/
Cost Management for Cloud Services • Holger Reibold
Cost management for clouds, containers, and hybrid environments tends to be neglected for reasons of complexity. The open source Koku software shows some useful approaches to this problem, although the current version still has some weaknesses.

Help Desk with FreeScout • Holger Reibold
The free version of FreeScout offers all the features of a powerful and flexible help desk environment and can be adapted to your requirements with commercial add-ons.

How to Query Sensors for Helpful Metrics • Andreas Stolzenberger
Discover the sensors that already exist on your systems, learn how to query their information, and add them to your metrics dashboard.

OpenELA Releases Enterprise Linux Source Code

OpenELA was formed by CIQ (the company behind Rocky Linux), Oracle, and SUSE with a singular purpose: “... to encourage the development of distributions compatible with Red Hat Enterprise Linux (RHEL) by providing open and free enterprise Linux source code.” And the initial release of the OpenELA source code is now available (https://github.com/openela-main).
But why is this happening? According to CIQ (https://ciq.com/blog/ciq-oracle-and-suse-launch-openela/), “The decision to establish OpenELA wasn’t made in isolation. It was a direct answer to the challenges posed by Red Hat’s recent policy shifts. At CIQ, we’ve always believed in the power of collaboration and open access.”
The site continues, “By teaming up with Oracle and SUSE, we’ll be able to provide the community with the tools, resources, and most importantly, the source code they need through OpenELA. With OpenELA, both upstream and downstream communities can fully leverage the potential of open source, from independent upstream projects through the delivery of compatible and standards-based Enterprise Linux derivatives.”
The code (found at the prior OpenELA GitHub page link) contains all of the basic packages for building an Enterprise Linux OS. Keep in mind, however, that the code is still very much a work in progress and some of the code has yet to be made public (due to OpenELA's continued removal of all Red Hat branding/trademarks).
At the moment, both Oracle and SUSE are planning on releasing their enterprise distributions based on OpenELA, and the Rocky Linux Software Foundation is considering the same.

StripedFly Malware Hiding in Plain Sight as a Cryptocurrency Miner

Attention Linux users: A malicious framework has been active for five years and has been incorrectly classified as a Monero cryptocurrency miner.
StripedFly uses very sophisticated TOR-based methods to keep the malware hidden and uses worm-like capabilities to spread its nasty payload from Linux machine to Linux machine (or Linux to Windows and vice versa).
No one is certain if StripedFly is being used for monetary purposes or straight-up cybersecurity attacks (for information gathering). What is clear is that it’s an advanced persistent threat (APT) type of malware.
The earliest known version of StripedFly was identified in April 2016 and, since then, it has infected more than a million systems. The StripedFly payload features a customized TOR network client that works to obfuscate communication to a C2 (command and control) server, as well as the ability to disable SMBv1 and spread to other hosts via SSH and EternalBlue.
When StripedFly infects a Linux system, it is named sd-pam and uses both systemd services and a special .desktop file to keep it persistent. It also modifies various Linux startup files such as /etc/rc*, .profile, .bashrc, and inittab.
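The persistence indicators above lend themselves to a quick local sweep. The following shell sketch is a hypothetical example, not part of Kaspersky's analysis: the function name and the suggested scan locations are this article's assumptions, and a hit only flags a file for manual inspection (legitimate systemd setups also reference an sd-pam helper).

```shell
# Hypothetical helper: scan the given directories for files that mention
# "sd-pam", the name StripedFly reportedly uses on Linux. A match is only
# a hint that deserves a closer look -- this is not a detection tool.
scan_for_sdpam() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # -r: recurse into the directory, -l: print matching filenames only
        grep -rl "sd-pam" "$dir" 2>/dev/null | while read -r file; do
            echo "possible indicator: $file"
        done
    done
}

# Typical places to look, per the startup files named in the article:
# scan_for_sdpam /etc/systemd/system "$HOME/.config/autostart" /etc
```

Anything the sweep prints should be compared against the known-good units and startup files your distribution ships.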
You can read Kaspersky’s in-depth analysis of StripedFly at https://securelist.com/stripedfly-perennially-flying-under-the-radar/110903/ . At the moment, patches to mitigate against StripedFly have yet to be released for Linux, but you can be certain your distribution of choice will be releasing the fix as soon as it is made available.
In the meantime, do everything you can to avoid phishing or visiting known malicious websites, keep your systems up to date, and use a password manager.




Experimental Wayland Support Planned for Linux Mint 21.3

Although distributions such as Ubuntu and Fedora have fully committed to Wayland (and are already shipping releases with it as the default display server protocol), Linux Mint is a bit behind in the migration to Wayland.
Even with X.Org still suffering from numerous shortcomings and security issues, some distributions have hesitated to make the switch. That’s understandable, because there are some desktop environments and even applications that have yet to fully support Wayland.
That should change soon, because the Linux Mint team will release version 21.3 with experimental support for Wayland.
Before you get too excited, Wayland will not be the default display server on Linux Mint 21.3. Instead, users can select the Wayland session from the login screen.
It’s also important to understand that Wayland won’t be fully supported in 21.3, because the Mint desktop is not yet as stable on Wayland as it is on X.Org. Do keep in mind that Wayland does have issues with NVIDIA cards, so your mileage may vary should you desire to test the new Wayland session.
Because this is Linux, anyone who wants to keep tabs on the Linux Mint/Wayland progress can check out the Trello board that is being used for the project (https://trello.com/b/HHs01Pab/cinnamon-wayland). You can also read more about this on the official Linux Mint blog (https://blog.linuxmint.com/?p=4591).

Window Maker Live 0.96.0-0 Released

Window Maker Live is alive and well, and the new release, 0.96.0-0, is an updated build of the Debian-based operating system.
Based on Debian 12.2, the new Window Maker Live release includes kernel 6.4.4 and nearly the full range of GNUstep applications that are available via Debian Bookworm.
In this new release, the Window Maker root menu has been bolstered with a new layout that includes a comprehensive listing of released programs, which are accessible from the top-level GNUstep Apps entry.
As far as updated packages, the biggest update comes in the way of Window Maker itself, which – like the Window Maker Live release number – is 0.96.0-0. This latest release features hot corners, more configurable actions in WPrefs, libXRes as an optional dependency, and support for _NET_WM_FULLSCREEN_MONITORS.
You’ll also find emacs 29.1, pcmanfm replaced with pcmanfm-qt, Greek added as a supported language, gtk2-nocsd removed, and basic printer support added via cups-pdf and system-config-printer.
In addition to the Claws Mail email client, you’ll find GNUmail has become available, and the default web browsers are Pale Moon and Surf.
You can download the latest version of Window Maker Live from Sourceforge (https://sourceforge.net/projects/wmlive/files/wmlive-bookworm_0.96.0/ ). Read the changelog (https://downloads.sourceforge.net/project/wmlive/wmlive-bookworm_0.96.0/ChangeLog) and the What’s New document (https://downloads.sourceforge.net/project/wmlive/wmlive-bookworm_0.96.0/WHATS_NEW) to find out more.

KDE Plasma 6 Sets Release Date

February 28, 2024. Mark your calendars because that’s the official date the KDE team has set for the release of KDE Plasma 6.0.
According to the official KDE release schedule (https://community.kde.org/Schedules/February_2024_MegaRelease), February 21 is the private tarball release, and February 28 is the official public release, which includes KDE Gear 24.02.0, KDE Plasma 6.0, and KDE Frameworks 6.0.
Some of the work that has been completed includes custom ordering for KRunner search results, the printers KCM rewritten in QML, double-click by default, tap-to-click by default, and icons throughout Plasma now coming from the system-wide icon theme.




In addition, you’ll find support for automatic bug reporting in DrKonqi, an autostart KCM that shows details about entries, no more chunky page footers in System Settings, a completely reorganized sidebar in System Settings, and smoother mouse wheel scrolling in apps based on QtQuick. The floating panel will now be the default.
The biggest change, however, is that Wayland will be the default graphics
stack (over X.Org). One nice touch that has been added is that distributions can
now customize the first page in the Welcome Center.
Of course, there will also be the usual bug fixes and security updates.
There will also be a new task switcher for KDE Plasma, making it much easier for
users to multitask.
You can read all about the upcoming changes to KDE Plasma in Nate Graham’s
official blog (https://pointieststick.com/2023/05/11/plasma-6-better-defaults/ ).

Fedora Project and Slimbook Release the New Fedora Slimbook

The new Fedora Slimbook is a sleek ultrabook that easily looks like it could have slipped out of the Apple factory.
It’s a 3.3-pound notebook with a 16" 2560x1600 high-res display (with a 90Hz refresh rate) powered by an NVIDIA RTX 3050 Ti GPU, an 82Wh battery, an Intel Core i7-12700H CPU (with 14 cores and 20 threads), and the Gnome desktop environment to make interacting with the hardware as user friendly as it gets.
As for the ports, you’ll find 1 USB-C Thunderbolt, 1 USB-C with DisplayPort, 1 USB-A 3.0, 1 HDMI 2.0, 1 AC, 1 Kensington Lock, 1 SD card reader, and a 3.5mm combo mic/headphone jack.
You can configure RAM from 16GB to 64GB, internal storage from 500GB to 2TB NVMe (and even secondary storage from 500GB to 2TB), and add RAID 0 or 1.
The base price of the Fedora Slimbook starts at €1,799. A fully configured version can run up to €3,156.
Assembly time is one week, and the devices are available for purchase now. Learn more on the product website (https://slimbook.es/en/store/slimbook-executive/fedora-slimbook-16-comprar).

Gnome Developers in Discussion to End Support for X.Org

In this merge request (https://gitlab.gnome.org/GNOME/gnome-session/-/merge_requests/98 ) the Gnome development team stated, “This is the first step towards deprecating the X11 session; the gnome-xorg.desktop file is removed, but the X11 functionality is still there so you can restore the X11 session by installing the file in the appropriate place on your own.”
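For anyone who wants to put that file back by hand, the sketch below shows what such a restore could look like. It is an illustration only: the target directory and the .desktop contents are this article's assumptions, and the authoritative file should be copied from the gnome-session source tree rather than retyped from here. The location is overridable so the example can be tried without root.

```shell
# Sketch: reinstall a gnome-xorg.desktop session entry by hand.
# ASSUMPTIONS: display managers read session files from /usr/share/xsessions,
# and the entry roughly matches the one gnome-session used to ship. The
# default below writes to a temp dir for safe experimentation; point
# XSESSIONS_DIR at /usr/share/xsessions (as root) for a real install.
XSESSIONS_DIR="${XSESSIONS_DIR:-$(mktemp -d)/xsessions}"

mkdir -p "$XSESSIONS_DIR"
cat > "$XSESSIONS_DIR/gnome-xorg.desktop" <<'EOF'
[Desktop Entry]
Name=GNOME on Xorg
Comment=This session logs you into GNOME
Exec=gnome-session
TryExec=gnome-session
Type=Application
DesktopNames=GNOME
EOF
echo "installed $XSESSIONS_DIR/gnome-xorg.desktop"
```

After the file is in place, the login screen's session picker should offer the X11 session again.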
That was then followed by the suggestion to remove the rest of the X11 session
code for the next cycle, which could then be followed by removing the X11 code
altogether.
This makes perfect sense, because X11 has been getting less and less testing
over the past few years and Wayland development continues to go full steam. On
top of that, Wayland is far more secure than X11 and offers features better suited
for modern displays and interfaces.
Of course, not every developer is keen on dropping X11 so soon. One commenter
in the thread mentioned how Wayland isn’t ready for graphics professionals (be-
cause it has yet to implement basic color management).
However, the Gnome team isn’t pulling the plug on X11 just yet. This proposal
only removes one 8-line text file that can be added back if a user wants to continue
with X11.
Removing support for X11 is an inevitability because Wayland is the future of
the Linux desktop. Chances are good that X11 will be fully deprecated by the end
of 2024.



NEWS
Kernel News

Zack’s Kernel News


Avoiding Bloat in the “If you have a hard time figuring out
Kernel That Does what the eventfs entries are, maybe you
Everything should just have made ‘iterate_shared’
Sometimes prospective features are use- show them, and then you could use fancy
ful, and sometimes they’re not. Some- tools like ‘ls’ to see what the heck is up in
times they’re useful, but only to a very that directory?”
specific set of users that somehow strad- Steven replied that he hadn’t actually
dle the divide between first- and second- copied the code from the /proc filesys-
class citizens. These special users are the tem, though he acknowledged there
kernel developers themselves. were similarities. He said, “I tried to look
Steven Rostedt recently posted a patch at how /proc does things and I couldn’t
that would generate a permanent file in really use it as easily, because proc uses
the TraceFS filesystem. The file would its own set of ‘proc_ops’, and I had some
identify the directory entries (dentries) different requirements.”
Chronicler Zack Brown reports and their reference counts, for dynamic But in terms of Linus’s suggestion of a
file creation in the EventFS filesystem. simpler way to see the EventFS entries,
on the latest news, views, Steven pointed to a recent debugging Steven replied, “I was more interested in
dilemmas, and developments within the Linux kernel community.

By Zack Brown

session where part of the debugging process involved creating such a file. He felt it would be useful for future debugging to have such a file available by default. There followed a fascinating exchange between Linus Torvalds and Steven. Linus's take on the situation was that “this is neither a bug-fix, nor does it seem to make any sense at all in the main tree. This really feels like a ‘temporary debug patch for tracing developers’.”

Steven replied that it did seem to be generally useful, because “it can be used to see what's happening internally.” He said he'd wrap the feature in an #ifdef statement, so that developers would be able to use it and other similar resources in the future for easy access to filesystem internals.

But Linus reiterated that this was not a feature he wanted in the kernel. He said:

“Honestly, you copied the pattern from the /proc filesystem.

“The /proc filesystem is widely used and has never had this kind of random debugging code in mainline.

“Seriously, that eventfs_file thing is not worthy of this kind of special debug code.

“That debug code seems to be approaching the same order of size as all the code eventfs_file code itself is.

“There's a point where this kind of stuff just becomes ridiculous. At least wait until there's a *reason* to debug a simple linked list of objects.

what did not exist than what existed. I wanted to make sure that things were cleaned up properly. One of my tests that I used was to do a: find /sys/kernel/tracing/events, and then run my ring_buffer memory size stress test (that keeps increasing the size of the ring buffer to make sure it fails safely when it runs out of memory). Then I check to make sure all the unused dentries and inodes were reclaimed nicely, as they hang around until a reclaim is made.”

However, Steven saw which way the wind was blowing and didn't intend to get blood on his sword over a potentially useful debugging patch. He asked, “Are you entirely against this file, or is it fine if it's just wrapped around an CONFIG_EVENTFS_DEBUG?”

Linus explained:

“I think [it's] extra code that we'd carry around – probably for much too long – with absolutely _zero_ indication that it's actually worth it.

“Not worth asking people about, but also not worth carrying around.

“You worry about bugs in it now, because it's new code. That's normal. That doesn't make your debug interface worth any kind of future.

“Keep it around as a private patch. Send it out to people if there are actual issues that might indicate this debug support would help. And if it has shown itself to be useful several times,

Author
The Linux kernel mailing list comprises the core of Linux development activities. Traffic volumes are immense, often reaching 10,000 messages in a week, and keeping up to date with the entire scope of development is a virtually impossible task for one person. One of the few brave souls to take on this task is Zack Brown.

12 JANUARY 2024 ISSUE 278 LINUX-MAGAZINE.COM


NEWS
Kernel News

at that point you have an argument for the code.

“As it is, right now I look at that code and I see extra BS that we'll carry around forever that helps *zero* users, and I find it very questionable whether it would help you.

“And if you really think that people need to know what the events exist in eventfs, then dammit, make ‘readdir()‘ see them. Not some stupid specialty debug interface. That's what filesystems *have* readdir for.”

But Linus replied to himself a couple of hours later, with a slightly different take. He said:

“Alternatively, if you have noticed that it's just a pain to not be able to see the data, instead of introducing this completely separate and illogical debug interface, just say ‘ok, it was a mistake, let's go back to just keeping things in dentries since we can _see_ those’.

“Put another way: this is all self-inflicted damage, and you seem to argue for this debug interface purely on ‘I can't see what's going on any more, the old model was really nice because you could *see* the events’.

“To me, if that's really a major issue, that just says ‘ok, this eventfs abstraction was mis-designed, and hid data that the main developer actually wants’.

“We don't add new debug interfaces just because you screwed up the design. Fix it.”

Steven remarked with a wry smile, “The entire tracing infrastructure was created because of the ‘I can't see what's going on’ ;-) Not everyone is as skilled with printk as you.”

He also explained the historical reasoning behind the current design, saying, “The old eventfs model was too costly because of the memory footprint, which was the entire objective of this code patch. The BPF [Berkeley Packet Filter] folks told me they looked into use a tracing instance but said it added too much memory overhead to do so. That's when I noticed that the copy of the entire events directory that an instance has was the culprit, and needed to be fixed.”

So Steven felt the “design” Linus had complained about was correct and didn't need to be “fixed.” But he added, “I get your point. I will agree that this interface will likely be mostly useful for the first year or two after the new code is added. But after a few years, we could delete it too.” And in a subsequent email, he also said, “I'll keep the code around locally, and if vfs ever changes and breaks this code where this file helps in solving it, I'll then do another pull request to put this file upstream ;-).”

And the thread ended there.

This was a short debate and probably fairly low-cost, because it didn't represent a huge amount of effort on Steven's part – he simply packaged up some debugging code that had recently proven useful. So the rejection from Linus didn't cost Steven very much. But the way Linus balances the needs of developers against the needs of the rest of us is very interesting to me personally. The Linux kernel project is completely dependent on the contributions of developers like Steven, while the rest of us – aside from possibly submitting a bug report once in a while – are simply the beneficiaries. But as far as Linus is concerned, Steven's bit of debugging code, benefitting only developers, had no place in the kernel, even as a relatively temporary aid until the feature it helped had stabilized. It's a fascinating balancing act on Linus's part, intended to keep the Linux kernel – an operating system that supports virtually every piece of computer hardware on the planet – from becoming bloated with extra code that might make it more difficult to maintain.

Particularly Odd Occurrences of Stardust
Recently, a Spectre variant 1 (V1) vulnerability may or may not have appeared in the Linux kernel. Spectre V1 is a bizarre vulnerability that takes advantage of CPU optimizations that make a reasonable guess at the result of conditionals, so it can begin to execute code along the path that's most likely to be taken after the conditional is performed. If it guesses right, it keeps those calculations; otherwise, it abandons them and starts again along the proper path. And because its guess is generally pretty good, the CPU tends to save time this way and increases overall performance.

The problem is that for those wrong guesses, the unneeded calculations aren't really abandoned at all – they still leave traces of data behind them (e.g.,




data such as passwords), which malicious programs can read and use.

When Spectre V1 was discovered, the Linux developers patched the kernel to prevent those data traces from lingering or being created in the first place. However, to maintain security, it's important that new kernel features and other patches avoid re-exposing those things.

Luis Gerhorst recently identified a patch that had previously gone into the Linux kernel as potentially re-exposing the Spectre V1 vulnerability under certain circumstances. The patch had allowed the kernel to compare the pointers used to access packets of data sent across a network – and specifically to allow the size of the data packets to be variable. According to Luis, it was the variability of the packet size that let Spectre V1 rear its head again.

If the packets had a fixed size, then the kernel could simply check the bounds. But with the variable packet size, hostile code could load more data beyond the packet itself, which would then be exposed when the kernel ran its comparison and the CPU optimized that conditional.

But it's not as clear as all that!

Alexei Starovoitov looked over Luis's argument and concluded that, in fact, there was no way for an attacker to actually get access to useful data in this particular situation. The attacker, Alexei said, could indeed expose sensitive data. However, because they would not have control over the various pointers involved, they would not be able to actually read that data in such a way as to know what data they were reading. Exposing the data was not enough! As Alexei put it, “the attack cannot be repeated for the same location. The attacker can read one bit 8 times in a row and all of them will be from different locations in the memory. Same as reading 8 random bits from 8 random locations. Hence I don't think this revert is necessary. I don't believe you can craft an actual exploit.”

Daniel Borkmann agreed with Alexei, but he felt there could be additional security vulnerabilities to take into account. Specifically, beyond the end of a given networking data packet, the kernel stored a data structure that contained memory addresses used by the kernel. And although Daniel agreed with Alexei that the attacker would not be able to access the data that worried Luis, he felt that the attacker would indeed have access to parts of those memory addresses.

The reason you want to keep Linux kernel memory addresses out of the hands of an attacker is that the addresses let the attacker make guesses about the overall layout of the kernel in system memory. The kernel relies on Kernel Address Space Layout Randomization (KASLR) to prevent such access for this reason. This feature loads the kernel into a random place in system memory, specifically to prevent attackers from knowing where a given part of the system is located, in order to target that part for an attack. Daniel's point was that by exposing even a portion of those kernel addresses, the kernel would allow the attacker to mitigate the effect of KASLR protections. So the vulnerability wouldn't give the attacker direct access to sensitive kernel data like passwords, but it would help the attacker identify other potential exploits that they might attempt.

Alexei, however, was still not convinced. He felt that the attacker would still not be able to identify the data they were accessing. Just as the attacker couldn't access passwords, the attacker would not be able to access those kernel addresses.

However, Luis was not convinced by Alexei being unconvinced. He felt that he had identified aspects of Spectre V1's basic vulnerability – things that could indeed be exploited. Also, in terms of Alexei's response to Daniel's specific case, Luis countered, “It is true that this is not easily possible using the method most exploits use, at least to my knowledge (i.e., accessing the same address from another core). However, it is still possible to evict the cacheline with skb->data/data_end from the cache in between the loads. […] For a CPU with 64KiB of per-core L1 cache all 64-byte cachelines can be evicted by iterating over a 64KiB array using 64-byte increments, that's only 1k iterations.”

Luis also posted some actual assembly code that he felt would leak data in this case.

Alexei acknowledged that “I have to agree that the above approach sounds plausible in theory and I've never seen anyone propose to mispredict a branch this way.” But he added that this probably “means that no known speculation attack was crafted. I suspect that's a strong sign that the above approach is indeed a theory and it doesn't work in practice.”

Alexei concluded sternly, “So I will insist on seeing a full working exploit before doing anything else here. It's good to discuss this cpu speculation concerns, but we have to stay practical. Even removing bpf from the picture there is so much code in the network core that checks packet boundaries. One can find plenty of cases of ‘read past skb->end’ under speculation and I'm arguing none of them are exploitable.”

Luis posted code that leaked some otherwise inaccessible data via Spectre V1. But he also acknowledged to Alexei, “However, you are right in that there does not appear to be anything extremely useful behind skb->data_end, destructor_arg is NULL in my setup but I have also not found any code putting a static pointer there. Therefore if it stays like this and we are sure the allocator introduces sufficient randomness to make OOB reads useless, the original patches can stay. If you decide to do this I will be happy to craft a patch that documents that the respective structs should be considered ‘public’ under Spectre v1 to make sure nobody puts anything sensitive there.”

The discussion ended there.

It's still unclear whether an actual useful exploit for either Luis's or Daniel's cases exists. But it's also true that Alexei's approach to this problem seems to follow Linus Torvalds's general principle that security fixes must address actual exploits, rather than people simply implementing speculative protections that might not actually be needed.

Security is an inherently nightmarish topic in software development, in which strange dreamscapes continually seem to turn the simplest truths on their heads. Whatever the most obvious assumption might be, it also might be exactly where a sudden vulnerability will be revealed. Many strange features and constraints in the Linux kernel boil down to the need to avoid particular vulnerabilities. And the answer to many of the oddest questions in kernel development is often, simply, security.
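The pattern at the center of this whole debate (a bounds check that the CPU may speculate past) is routinely defused in the kernel by the array_index_nospec() helper, which clamps an index with a bit mask instead of relying on a branch. The Python sketch below only models that logic, not the microarchitecture; the toy packet and its size are invented for illustration, and it also restates Luis's 1k-iteration eviction arithmetic:

```python
def clamp_index(index: int, size: int, bits: int = 64) -> int:
    """Return index if 0 <= index < size, else 0, using a bit mask.

    Models the idea behind the kernel's array_index_nospec() helper:
    the out-of-bounds case is neutralized by masking rather than by
    a branch, so even a misspeculated load stays inside the array.
    (Python has no fixed-width integers, so the in-bounds test is an
    ordinary comparison here; the kernel derives the mask with pure
    arithmetic.)
    """
    mask = (1 << bits) - 1 if 0 <= index < size else 0
    return index & mask


packet = list(range(16))  # toy 16-byte "packet"; contents are invented
print(packet[clamp_index(5, len(packet))])     # in-bounds index passes through
print(packet[clamp_index(1000, len(packet))])  # OOB index is forced to slot 0

# Luis's eviction estimate: sweep a 64KiB L1 cache in 64-byte strides
print((64 * 1024) // 64, "iterations")
```

The point of the mask is that the clamped index is produced by data flow rather than control flow, so there is no conditional branch for the CPU to guess wrong about.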



COVER STORY
Science on a Crypto Rig

Scientific computing with a crypto mining rig

Second Chance
Lots of retired Bitcoin mining computers are showing up on the second-hand
market for cheap. Could these once-impressive machines have a second life
in scientific computing or machine learning?

By Steffen Möller, Christian Dreihsig, Sebastian Hilgenhof, Malte Willert

Despite the steady increase in computing power from one generation to the next, computers are rarely fast enough for their users. Over the years, programmers and PC vendors have found ways to speed them up. If you know exactly how a computer will be used, you can design it to maximize performance and minimize cost.

Crypto rigs are created with only one task in mind: to perform the arcane mathematical computations associated with crypto mining. The crypto gold rush has led to a rapid evolution of the technology – a mining unit that was competitive a few years ago might already be obsolete. For instance, a few years ago, mining rigs made extensive use of Graphics Processing Units (GPUs); in more recent years, Field Programmable Gate Arrays (FPGAs) and then Application-Specific Integrated Circuits (ASICs) have replaced graphics cards. Crypto mining has also experienced a bit of a downturn recently due to environmental fears and instability of the larger economy.

As a result of these and other factors, mining rigs are increasingly ending up on the second-hand market, where you can buy them relatively cheaply even if you are not a professional user. Could one of these rigs serve another role?

Mining rigs make extensive use of GPUs, and GPUs are well suited to scientific computing and machine learning. Several GPUs in a single computer will boost the potential performance many times over for a computation-intensive activity, such as solving a large mathematical problem.

We decided to buy a used crypto mining rig and see how it compares to a higher-end computation-focused commercial system. This article summarizes our findings. First, however, we'll provide a little background on what you do (and don't) get when you invest in a used mining rig.

PCIe Versions
Most used mining rigs power regular graphics cards via Peripheral Component Interconnect Express (PCIe) [1]. If the board is large enough, the rigs can be plugged in right next to each other. There are also variants where the motherboard does not offer the slots directly but outsources them to a PCIe backplane. Depending on the version of the motherboard, up to 18 cards can be addressed. They then no longer fit into a case but are connected via extension cables (risers) – either as a 1:1 extension of the slot or via an x1 plug-in card that simply transmits the PCIe signal via an inexpensive USB 3 cable.

The PCIe bus, which has been around since 2003, can play host to a number of components, from the WLAN board to the graphics card. The speed of the PCIe bus has doubled with each new version of the standard; the current version is 4.0. If you take a look at a motherboard, it is clear that the slots have different widths, which means that different numbers of PCIe channels can connect to the card – from x1 (one channel) up to x16 with 16 times more throughput. (You can also install an x1 card in an x16 slot and vice versa.) The slots are compatible with each other up to PCIe 4.0; in other words, systems designed for different versions can communicate with each other via the standard of the lower version.

Power Supply
The power supply plays an important role in systems that need to run continuously. The requirements are very high due to the possibility that several graphics cards could experience peaks simultaneously (after all, the tasks run in parallel). In just a few months, you might discover that the electricity bill exceeds the initial cost of the rig.

Mining rigs often use second-hand server power supplies to reduce costs. A server power supply is powerful and very energy efficient: Most achieve the 80 Plus Platinum efficiency rating (more than 94 percent efficiency at 50 percent load) and are often unbeatably cheap to run. However, this kind of power supply only gives you 12V and is therefore not suitable for the ATX-based motherboards found on many common PCs [2] without changes. It is easy to understand why the small PicoPSU power converter board [3] has become popular, because it also supports other voltages. Replicas of the PicoPSU are also available from various Chinese manufacturers. These boards are very popular, especially for home theater PCs or similar devices.

PicoPSUs and their replicas come with some pitfalls that you need to watch out for. They mainly provide power on the 12V rail, which they simply loop through from the power supply. If the consumer requires other voltages, such as 3.3V or 5V (say, for SSDs), the power supply could fall short. A look at the datasheet reveals a current of 6 amps – not really much, considering that a PCIe card is allowed to draw 3 amps from the 3.3V



rail according to the standard. Since the motherboard also needs some power itself, this is actually only enough for a single PCIe card.

GPUs don't cause problems because they convert the voltage from the 12V line themselves and cause virtually no load on the 3.3V rail. But other cards can quickly create a power squeeze. M.2 solid-state drives (SSDs), for example, are only connected to the 3.3V rail (M.2 only has 3.3V pins) and can consume up to 10W under load – at 3.3V, this is half of the permissible power consumption at 3 amps. This just goes to show how quickly you can provoke a load-dependent failure. SATA devices are also allowed to draw up to 4.5 amps per rail.

If you are buying new components, choose a motherboard and power supply that match each other. But our focus is on budget used hardware. The combination of inexpensive used server power supplies and a PicoPSU is often both cheap and fit for purpose.

If you are buying a used rig, keep in mind what the hardware was once designed for. Server hardware, for example, is not optimized for quiet operation. In our case, both the fans of the original mining rig and the fans of the replacement case were so loud that they were annoying even when I put them in a different room and kept the door to the room closed. If you think you can solve the noise problem by installing the graphics cards into a normal PC case, think again. In this case, the graphics cards are passively cooled and dependent on the airflow in the case.

CPU and Chipset
Mining hardware is usually radically cost-optimized. The optimization typically starts with the CPU. The CPU is not used for the actual mining, so mining rigs often use an inexpensive, power-saving processor like a Celeron and save their budget for other more critical components. The mining rig we used in our test had a small dual-core Intel CPU in a ball grid array (BGA) package, which means it was soldered and could only be replaced along with the motherboard. GPU mining rigs

Figure 1: The spartan test computer: Cost-optimization was the top priority.
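The 3.3V rail arithmetic above is worth spelling out. This sketch just restates the article's numbers; the 6-amp limit comes from one PicoPSU-class datasheet and will vary by model:

```python
RAIL_V = 3.3     # voltage of the 3.3V rail
BUDGET_A = 6.0   # current limit from the datasheet cited above

m2_ssd_a = 10.0 / RAIL_V  # a 10W M.2 SSD under load draws ~3A at 3.3V
pcie_card_a = 3.0         # per-card allowance on the 3.3V rail (PCIe spec)

print(f"M.2 SSD: {m2_ssd_a:.2f} A")
print(f"SSD + one PCIe card: {m2_ssd_a + pcie_card_a:.2f} A "
      f"against a {BUDGET_A:.0f} A budget")
```

A single loaded SSD plus one card already pushes past the 6-amp budget before the motherboard's own 3.3V draw is counted, which is exactly the load-dependent failure the text warns about.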




usually have no more than 4GB RAM. Better graphics cards offer the possibility to interconnect – NVIDIA calls this Scalable Link Interface (SLI) or NVLink; CrossFireX is the AMD equivalent. This interconnect feature allows multiple cards to act as a single large board, reducing communication on the PCIe bus.

Cost optimization is also reflected in the case (Figure 1): The test rig case is not much more than a galvanized steel box with a few cutouts for fans (Figure 2). Preparations for cable routing, for example, were not needed because everything was plugged into a backplane. If you are thinking about a potential hardware conversion, you should get used to the idea of using a drill, pliers, and a little creativity to work around the limitations of the case.

The processors already provide the PCIe channels. The motherboard distributes these channels to the slots – either directly or via a PCI switch in professional systems. This design means that only a limited number of PCIe channels are available. A single card can access the full x16 bandwidth of the PCIe channels. However, if there is another card in the slot next to it, each card receives a maximum of x8, and this can drop to x1 as you add more cards.

Motherboard descriptions often prove to be anything from superficial to misleading when they refer to the physical width of the slot instead of the number of channels feeding the card.

PCI switches are also available on the server boards, but the total number of available PCI channels is higher due to the use of two processors. Currently, AMD's PCIe 4.0 standard offers a technical advantage on both desktops and servers with twice the transfer speed per channel and a higher number of channels provided by the processor.

The Test Candidate
We purchased a mining rig with a backplane and separate motherboard at auction for EUR750. The system did not work reliably at first. The power supply worked, but it was too loud and smelled unhealthy. The eight installed NVIDIA P106-090 mining cards from 2018 with PCIe 1.1 x4 were OK. We treated them to a new case, memory, motherboard, processor and, to be on the safe side, a new power supply for another EUR350.

We wanted to compare the performance of this used mining rig with a high-end professional system. The professional hardware we chose for comparison was a 2020 system with eight NVIDIA A100 cards and PCIe 4.0 x16. The cost for this professional system was more than EUR75,000, which was 100 times more expensive than the mining rig we bought at auction.

GPU-focused systems are optimized for computation-intensive operations, so we wanted to stay with that basic scenario in our tests. We tested two different use cases:

Figure 2: The case is little more than a galvanized steel box with some fan cutouts and everything plugged into a backplane.
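Two of the constraints described above (per-slot lane width shrinking as cards are added, and per-lane speed doubling with each PCIe generation) can be sketched together. The halving allocator below is only an illustration of the pattern, not how any particular motherboard or chipset actually assigns lanes, and the 250MB/s-per-lane base treats each generation as a clean doubling, glossing over encoding-overhead differences between PCIe versions:

```python
def lanes_per_card(total_lanes: int, cards: int) -> int:
    """Largest power-of-two lane width each card can get, floor x1."""
    width = 1
    while width * 2 * cards <= total_lanes:
        width *= 2
    return width

def bandwidth_mb_s(generation: int, lanes: int) -> float:
    """~250 MB/s per PCIe 1.x lane, doubling per generation (approx.)."""
    return 250.0 * 2 ** (generation - 1) * lanes

for n in (1, 2, 4, 8):
    w = lanes_per_card(16, n)
    print(f"{n} card(s): x{w} each, "
          f"{bandwidth_mb_s(4, w):.0f} MB/s at PCIe 4.0")
```

By the same rough math, the mining rig's risers pin each card at PCIe 1.1 x1, which is in the neighborhood of 250MB/s – the bottleneck the benchmarks later keep running into.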




• Scientific computing using the BOINC crowdsource computing platform [4]
• Machine learning with the PyTorch deep learning framework [5] and a well-known test dataset to teach the system to distinguish between dog and cat images

A cheap used mining rig that sells for one percent of the cost of an advanced computer system would be a big advantage, but we were realistic. We had no illusions a EUR750 mining rig would outperform the high-end commercial system in an absolute sense. We were more curious about whether it was competitive in delivering computing power per cost. In other words, if an option delivers one tenth of the computing power but comes at only one hundredth of the cost, there are scenarios where it could be a viable alternative.

We were also aware that the different components of the design would affect performance in different ways. The two systems didn't just have two different GPUs. The difference between the PCIe 1.1 x4 bus and the PCIe 4.0 x16 bus also seemed significant, as well as the differences in the CPUs. For a few of the tests, we experimented with putting the GPUs from the mining rig into the newer system to isolate the GPU as a variable.

BOINC Benchmarks
We picked out three BOINC-based crowdsource projects that support GPUs. Einstein@Home [6] uses data from the LIGO gravitational-wave detectors, the MeerKAT radio telescope, the Fermi Gamma-ray Space Telescope, as well as archival data from the Arecibo radio telescope to look for spinning neutron stars (often called pulsars). The professional system with eight A100 cards needed 300 seconds in this test. The mining rig took 2,000 seconds per work unit, which is more than six times as long, but again, the professional system was 100 times more expensive.

Was the superior performance of the professional computer due to the GPU or the faster processor with its faster and wider PCIe bus? To find out, we installed the P106-090 cards from the mining rig into the professional system. Despite the faster processor and the 4x instead of 1x PCIe channels, the P106-090 cards ran only one percent faster when installed on the faster system. Einstein@Home allows multiple work units to share a GPU. We would have expected that processing two work units at once would lead to a performance advantage, but calculating two jobs on one card also doubled the computing time, so it did not yield an advantage.

The prime number search with PrimeGrid [7] requires virtually no CPU interaction with the cards (less than two percent CPU load). The P106-090s of our test system required between 916 and 925 seconds (CudaPPSsieve) and about 4,500 seconds (OCL_cuda_AP27). The A100s in the professional rig completed the task in about one tenth of the time in each case.

For the third BOINC test, we selected a benchmark program for the Folding@home biomedical project [8] and launched it simultaneously on several GPUs. The benchmark measures how many nanoseconds of a process in nature the computer can model within one day. With single precision, the mining rig's P106 GPUs managed 59 ns/d when placed in the professional system, whereas the A100 achieved 259 ns/d. With double precision (not supported in the hardware on the P106) it was 159 ns/d on the A100, while the P106 achieved just 3 ns/d.

PyTorch
PyTorch is an open source machine learning framework. We put together a manageable script that uses a neural network to classify images on a varying number of graphics cards (or just on the CPU). To do this, the images must be transported to the graphics cards and, if the results are distributed over several cards, they also need to be merged again at the end. During training, the models also need to be updated on all cards.

The CPU is not something you can do without in machine learning projects with GPU support. On the contrary, it actually becomes more and more important as the number of compute cores increases. It first prepares the data for the GPU and then summarizes the GPU results. If you distribute the workload over many GPUs, the processor can definitely become the bottleneck for which the graphics units will have to wait. How much the communication between GPU and CPU can be reduced depends on the application. If the data can be represented as a matrix and the application is based on operations on matrices or between them, GPUs are hard to beat.

We assumed for our study that the number of cards used does not affect the quality of the predictions. We actually did not pay any further attention to the quality of the prediction, as it can depend on a variety of factors, such as the quality of the training dataset or the size of the batches. We exclusively looked at the number of images that could be trained or classified (evaluated) every second with the given hardware.

Analysis of the Results
The P106-090s only support PCIe 1.1 with x4 channels for communication. In our mining rig with x1 risers, they were therefore only connected with PCIe 1.1 x1. In a PCIe 4.0 x16 environment, the same cards can be addressed with four times the throughput. The fact that the BOINC computing times hardly changed when switching from PCIe 1.1 x1 to PCIe 4.0 x16 on the faster system reflected the fact that the projects we selected use the GPU almost exclusively. In the style typical of BOINC, these manageable computational jobs are designed to be computed independently of each other – they do not need to be synchronized with the computations on the neighboring GPU.

To our astonishment, the A100 cards could hardly exploit their advantages in the BOINC test. Even the speed-up factor of 10 achieved in a prime number search seems low compared to a factor of 100 if you compare the hardware price.

Although the mining rig might have been competitive in computations per euro, the P106-090 (75W per card) is clearly inferior to the maximum 250W per professional card in terms of performance per consumed watt – after all, you would have to spend between 475 and 750W for the same computing performance with data-parallel requirements. However, in commercial use, it is important to note that the real cost could be in the longer wait time. Things you can compute in an hour on a large card take a whole workday on a small one.

The machine learning test with PyTorch was different. The small cards of the mining rig completely failed to process larger batches, which specifically benefit from parallelization. The weak bus connection in the rig and slow communication due to PCIe 1.1 ate up the advantage of parallelization in the test.
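The authors' PyTorch script is not printed in the article; the following is a minimal sketch of the same kind of throughput measurement, assuming PyTorch is installed. It uses a tiny network, synthetic stand-ins for the cat/dog images, nn.DataParallel when more than one GPU is visible, and reports images per second. It runs on a plain CPU as well:

```python
import time
import torch
import torch.nn as nn

# Tiny stand-in for the cat/dog classifier: synthetic 32x32 RGB "images",
# two classes. This measures throughput only, not prediction quality.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 2),
)
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # scatter batches across all visible GPUs
model = model.to(device)

opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
images = torch.randn(256, 3, 32, 32, device=device)
labels = torch.randint(0, 2, (256,), device=device)

start = time.time()
for _ in range(5):  # a few training steps are enough for a rough rate
    opt.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    opt.step()
elapsed = time.time() - start
print(f"{5 * 256 / elapsed:.0f} images/s, last loss {loss.item():.3f}")
```

With DataParallel, each batch is split across the cards and the gradients are gathered back on the primary device, which is exactly the transport-and-merge overhead the text describes becoming the bottleneck on a slow bus.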




There were no big surprises in the test. Although the training phase took longer than classifying with the taught model, the relative times of the different hardware configurations matched. We also tried neural networks with different sizes; this had an effect on the maximum batch size, but with roughly equal relative speeds of the systems to each other. This is why we are only showing the figures for training with the smallest model in Figure 3.

Conclusions
For data-parallel requirements like BOINC, the eight cards in the mining rig roughly match the performance of a single pro card, but cost less, even taking into account the higher power consumption. For machine learning, however, a good, modern graphics card with plenty of memory is preferable. With support for the 16-bit floating-point numbers frequently used in machine learning, and even for integer operations just 8 or 4 bits wide, the newer cards extend their lead.

Cloud services are an interesting footnote for this study. Although not every hosting service provider offers special GPU computers yet, you can already find offers with 8 and 16 cards. Prices vary depending on the number and type of GPUs selected. In some scenarios, you might come up with a configuration where the mining rig serves as a local installation that is useful for preparing projects to run on faster systems in the cloud, as long as you are allowed to store the data in the cloud and the latencies for data transfer are compatible with the project goals.
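The power trade-off in the conclusion can be checked against the article's own figures. This back-of-the-envelope sketch uses only the Einstein@Home times and the GPU board-power ratings quoted earlier, ignores the CPU, PSU losses, and everything else, and assumes all eight cards are busy in both systems, so it is a rough comparison rather than a measurement:

```python
# Numbers taken from the article; everything else is ignored.
rig = {"seconds_per_wu": 2000, "watts": 8 * 75}   # eight P106-090 cards
pro = {"seconds_per_wu": 300, "watts": 8 * 250}   # eight A100 cards

for name, spec in (("mining rig", rig), ("pro system", pro)):
    kj = spec["seconds_per_wu"] * spec["watts"] / 1000  # energy per work unit
    print(f"{name}: {kj:.0f} kJ per Einstein@Home work unit")
```

By this crude measure, the pro system finishes a work unit on roughly half the GPU energy, which matches the article's point that the old cards lose on performance per watt even where they win on performance per euro.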

Figure 3: The benchmark results for the machine learning test for image classification.

Info
The authors would like to express their special thanks to the HPC specialist MEGWARE GmbH [9]. The company provided access to its test computers and installed the P106-090 from the test rig into one of its systems for direct comparison.

Info
[1] PCIe: https://en.wikipedia.org/wiki/PCI_Express
[2] ATX: https://en.wikipedia.org/wiki/ATX
[3] PicoPSU: https://www.onlogic.com/technology/glossary/picopsu/
[4] BOINC: https://boinc.berkeley.edu/
[5] PyTorch: https://pytorch.org/
[6] Einstein@Home: https://einsteinathome.org
[7] PrimeGrid: https://www.primegrid.com/
[8] Folding@home: https://foldingathome.org
[9] MEGWARE GmbH: https://www.megware.com/en/



A tour of some important data science techniques

Method in the Madness
Data science is all about gaining insights from mountains of data.
We tour some important tools for the trade. By Tom Alby

Data is the new oil, and data science is the new refinery. Increasing volumes of data are being collected by websites, retail chains, and heavy industry, and that data is available to data scientists. Their task is to gain new insights from this data while automating processes and helping people make decisions [1]. The details for how they coax real, usable knowledge from these mountains of data can vary greatly depending on the business and the nature of the information. But many of the mathematical tools they use are quite independent of the data type. This article introduces you to some of the methods data scientists use to squeeze insights from a sea of numbers.

More than Just Modeling
The term data scientist evokes associations with math nerds, but data science consists of far more than building and optimizing models. First and foremost, it involves understanding a problem and its context.

For example, imagine a bank wants to use an algorithm to predict the probability that a borrower will be able to repay a loan. A data scientist will first want to understand how lending has worked so far and what data has been collected in this field – as well as whether that data is actually available – with a view to data protection requirements. In addition, data scientists need to be able to communicate their findings. Storytelling is more useful than presenting infinite rows of numbers, because the audience is likely to be made up of non-mathematicians. The need to clearly explain the findings frequently presents a challenge for less extroverted data scientists.

Preparing the Data
What sounds simple in theory often requires time-consuming data cleaning and transformation. Data is not always available in the way you need it. For example, many algorithms require numerical data to be extracted from non-numerical data.

To separate the data, the data scientist forms categories that can be divided using either numerical distances or dummy variables, where each occurrence of a characteristic (such as male, female, and nonbinary) becomes a separate variable. As a rule, one variable can be omitted. For example, in this data set, someone can only be male if they are neither female nor



COVER STORY
Data Science Methods

nonbinary. However, erroneous user input often results in data points that could bump an algorithm off track. These data points need to be identified and cleaned up.
The data scientist also looks for variables that are genuinely relevant to the model. This is where the information gathered during the understanding phase comes into play. In an exploratory data analysis, often in a Jupyter Notebook or similar, the data scientist generates and documents the findings in order to share them with colleagues (or at least ensure that the findings are repeatable).

Choosing a Suitable Model
First and foremost, the choice of algorithm depends on the task. If data capable of training an algorithm is available, data scientists refer to this scenario as supervised learning. For instance, if you have access to historical data on loan defaults, you could use it to predict whether future borrowers will repay their loans. The variable used for training is often referred to as the target variable – in this example, this is simply whether or not a loan has been repaid. Other examples would be classifications, such as whether or not a birthmark is indicative of skin cancer, or whether a customer is a fraudster.

Unsupervised Learning
If data exists but does not contain target variables, then it is often a matter of finding a pattern in the data, for example, to classify customers into segments. This type of machine learning is known as unsupervised learning. One of the most popular algorithms in unsupervised learning, judging from the number of tutorials on the subject, is k-means. The k-means algorithm clusters the data (i.e., it breaks the data down into segments). Roughly described, this method first locates centroids at the data points and then calculates the distances from the data points to these centroids.
The data points closest to each of the centroids give you the first clusters. You then compute the actual centers of these clusters. The result is the new distances of the individual data points to the center points. Based on this, the clusters re-sort. This process is repeated until the centers stop changing.
Figure 1 shows this approach. The number of segments is determined by the value k, which must be specified. This raises the question of the appropriate number; the answer is provided by the elbow test. The elbow test involves running k-means with different cluster sizes and showing how much variance there is within clusters for the different values. Visualizing these variances typically creates a dent in the curve – the elbow, where you can read off the optimal value for k.

Association Rules
Association rules, as used by stores to offer similar products, are another popular example of unsupervised learning. "Customers who purchased X often also look at Y" would be a typical application of association rules. Working with association rules usually involves looking at items (e.g., a product in a store) in the context of transactions, which can also be understood as shopping carts or cash register receipts. The Apriori algorithm is a popular approach because it requires less computation. Apriori ignores rare items and also the transactions in which they appear, which means that it has a far smaller data volume to work through.
Rules with different characteristic values are created from the remaining transactions, as a function of the parameters: Support shows how often a shopping cart occurs in comparison to all shopping carts (other items can also exist in the shopping cart). Confidence tells you how often an item appears when another defined item is present. Lift indicates how much more frequently a combination occurs than the independent items. Rules that have a high lift and at the same time appear frequently enough to be seen by users are of interest.

Supervised Learning
One of the simplest machine learning models is linear regression. Linear regression has been around since the 19th century, and it is a little like the "Hello World" of machine learning. Figure 2 shows the occurrences and prices of used SLR cameras for a specific camera model. The more occurrences, the less a used camera is likely to cost, as the data points also already indicate. But how can you determine a fair price?

Figure 1: Visualization of a k-means clustering. First calculate the red center values for the black data points. Then, if necessary, redistribute the points to the resulting clusters as a function of the distance to the respective center.
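The clustering loop described for k-means – assign each point to its nearest centroid, recompute the centers, repeat until nothing moves – fits in a few lines of plain Python. This is a minimal 1-D sketch with invented data, not production code:

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm on 1-D data: assign points to the nearest
    centroid, recompute the centroids, repeat until they stop changing."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        new = [sum(c) / len(c) if c else centroids[i]
               for i, c in enumerate(clusters)]
        if new == centroids:  # the centers stopped changing
            break
        centroids = new
    return sorted(centroids), clusters

# Two obvious groups around 1.0 and 10.0 (invented data).
points = [1.0, 1.2, 0.8, 9.8, 10.0, 10.2]
centroids, _ = kmeans(points, k=2)
print(centroids)  # [1.0, 10.0]
```

The elbow test from the article would simply run this for several values of k and compare the remaining within-cluster variance for each.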

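To make the three association-rule measures from the section above concrete, here is a plain-Python calculation on four invented shopping carts. It illustrates only the support/confidence/lift arithmetic, not Apriori's pruning of rare items:

```python
# Four toy shopping carts (invented); evaluate the rule "bread -> butter".
carts = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "jam"},
]

def support(itemset):
    # Fraction of all carts that contain every item of the itemset.
    return sum(itemset <= cart for cart in carts) / len(carts)

def confidence(antecedent, consequent):
    # How often the consequent appears when the antecedent is present.
    return support(antecedent | consequent) / support(antecedent)

def lift(antecedent, consequent):
    # How much more frequent the combination is than independence predicts.
    return confidence(antecedent, consequent) / support(consequent)

print(support({"bread", "butter"}))       # 0.5
print(confidence({"bread"}, {"butter"}))  # ~0.67
print(lift({"bread"}, {"butter"}))        # ~1.33
```

A lift above 1, as here, means bread and butter appear together more often than chance alone would suggest – exactly the kind of rule a store would surface.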



In linear regression, you draw a line through the data points and then measure the distances of the data points from the line (residuals). The residuals are squared to get rid of negative signs and then totaled. The better the line fits between the data points, the lower the sum of squared distances. The regression is done when you find the line with the lowest sum. Using this regression line, it is now possible to read off which price is appropriate for the used camera given a number of occurrences.

Figure 2: Visualization of a linear regression. The least squares method yields the regression line shown here.

Support Vector Machines (SVMs) are another technique that works with distances and lines. People started thinking about this algorithm as early as the 1930s and 1950s, but it was not until the 1990s that SVMs made their breakthrough. SVMs are often about classification: Data points need to be broken down into different classes. As with linear regression, a line is drawn between the data points. But you do not work with this line alone; instead there are two auxiliary lines, the support vectors, which you draw parallel to the first line. Now you need to position the main line so that the supports are as far away from it as possible without data points crossing the supports. Figure 3 shows an example of this process.
It is not always possible to set the supports so that all data points are outside. In this case, you calculate an error value (based on the distance to the support line) for each incoming data point, total the errors, and then look for the position of the line that has the lowest error value. The special thing about SVMs is that you can add dimensions to improve data separation – this is known as the kernel trick.

Figure 3: Visualization of an SVM: The area around the lines needs to be as wide as possible without enclosing data points. Then the point groups are separated in the best possible way.

Naive Bayes is an algorithm that is not based on distances. It is based on Thomas Bayes' theorem (published posthumously in 1763) and works with conditional probabilities. The algorithm deals with the probability of a case occurring (for example, default on a loan), taking into account a certain condition (the debtor has a negative credit report). However, things get a little more complicated, because there is typically not only one condition but several; for example, whether the debtor owns real estate, has an account with a bank, and many other factors. These probabilities are related to how often the cases occur overall (e.g., how often someone owns real estate). By the way, the algorithm does not just work with numbers but also with speech. Some spam filters are also (but not exclusively) based on the Naive Bayes algorithm.
In recent years, one algorithm in particular has caused quite a stir: XGBoost. XGBoost comes from the family of gradient boosting algorithms, hence its name, "Extreme Gradient Boosting." It derives from the decision tree, a machine learning algorithm popular for decades due to its traceability. A decision tree first separates data points based on the criterion in which they differ the most. By combining multiple trees (an ensemble) of deliberately weak models, each tree learns from the mistakes of the previous trees (boosting).
Reinforcement learning is often seen as outside the categories of supervised and unsupervised because, in a sense, it combines both approaches. Algorithms in this category find their own learning strategies and are then rewarded by feedback. One example of reinforcement learning is Google's AlphaGo.
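The least-squares idea from the linear regression section above has a closed-form solution for a straight line. Here is a plain-Python sketch with invented occurrence/price pairs (not the article's actual camera data):

```python
# Fit y = a + b*x by minimizing the sum of squared residuals
# (the closed-form least-squares solution for a straight line).
def fit_line(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Invented data in the spirit of Figure 2: more offers, lower price.
occurrences = [1, 2, 3, 4, 5]
prices = [100, 90, 80, 70, 60]
a, b = fit_line(occurrences, prices)
print(a, b)  # 110.0 -10.0
```

The negative slope mirrors the camera example: each additional listed occurrence lowers the predicted fair price along the fitted line.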

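Likewise, the conditional probability that Naive Bayes reasons about comes straight from Bayes' theorem. A sketch with invented loan-default numbers:

```python
# Bayes' theorem with invented numbers:
# P(default | negative report) = P(neg | default) * P(default) / P(neg)
p_default = 0.1            # prior: 10% of borrowers default
p_neg_given_default = 0.8  # 80% of defaulters had a negative credit report
p_neg_given_ok = 0.2       # ...as did 20% of borrowers who repaid

# Total probability of seeing a negative report at all.
p_neg = p_neg_given_default * p_default + p_neg_given_ok * (1 - p_default)
p_default_given_neg = p_neg_given_default * p_default / p_neg
print(round(p_default_given_neg, 3))  # 0.308
```

A full Naive Bayes classifier multiplies one such likelihood per feature (real estate, bank account, and so on), treating the features as independent of each other – that independence assumption is the "naive" part.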
Performance Measurement
With classification methods, in particular, you need a metric to judge how well a model performs. To evaluate this, however, you need to know more than how often a model returns correct predictions.
If a model says no to every credit decision, it could correctly predict all credit defaults (true positives). The true positive rate is also referred to as the sensitivity. Unfortunately, the bank would then lose its business model, since the model would also prevent the good transactions (false positives). If, on the other hand, it allowed all applications, it would allow all incorrect decisions (false negatives) in addition to the correct decisions (true negatives – also known as the specificity).
Ideally, a model will minimize both false positives and false negatives: At both extremes, the bank goes broke – either because it no longer does any business at all or because too many loan defaults occur and can no longer be compensated for by loan income. The four values of false and true positives and negatives map to a confusion matrix. The confusion matrix reveals the number of cases an algorithm generates in each category of positives and negatives. This information in turn provides a good overview of the performance details, although a comparison with other model variants is difficult because the performance is not available as a key figure.
One way to acquire a key figure metric is to use ROC AUC (Receiver Operating Characteristic, Area Under the Curve). The underlying approach of ROC AUC involves plotting the data points on two axes – one for the sensitivity and the other for 1 minus the specificity (the false positive rate). The area under the resulting curve is then used as the key figure (Figure 4). If the value is near 0.5, the results are as good as pure coincidence, and below 0.5 the results are worse than random decisions.
The Precision Recall Curve offers another option. The term precision, in this case, is the ratio of the true positives to the sum of the true and false positives; the recall value is the same as the sensitivity.
The statements made by all of these key performance indicators (KPIs) have their limitations, though, if you want to know how a model will behave in the real world. For example, it is often useful to run a model against the previous model (or manual processes, if applicable) in a split test. To stay with the bank example: Did the model result in fewer loan defaults? On top of this, you also have to develop and maintain the model, which incurs costs. Does this overhead pay off?
Another issue that many tutorials ignore: Although a model might work well, it might possibly discriminate against some of the actual people that the data points represent. For example, the inventor of Ruby on Rails, David Heinemeier Hansson, had this experience [2] when the limit his wife was given for her Apple Card credit card was 20 times lower than his own limit. Oddly enough, Mrs. Hansson had a better credit score than her husband and was taxed jointly with him. This suggests that gender alone was the reason for giving her a lower limit.
In addition to just measuring the performance of an algorithm, it is also important to test whether an algorithm discriminates. One way to test for discrimination is to enter exactly the same data in a credit application, except for the gender or some other variable you are testing.

Figure 4: Visualization of a ROC AUC curve: The area under the curve is an indicator of quality.

Conclusion
Data science is a vast topic that is constantly evolving as computers grow more powerful and new techniques emerge. This article outlined some popular techniques that data scientists use when they delve into data to find answers for their questions. Q Q Q

Info
[1] Tom Alby, Data Science in Practice (Chapman & Hall, 2023):
https://www.routledge.com/Data-Science-in-Practice/Alby/p/book/9781032505268
[2] Tweet on Apple Card:
https://twitter.com/dhh/status/1192540900393705474

Author
Tom Alby is the author of several books, a lecturer on everything data related at several universities, and has worked at companies such as Bertelsmann, Google, and bbdo. Today, he is Chief Digital Transformation Officer with Allianz Trade.



Getting started with the R data analysis language

Number Game
The R programming language is a universal tool for data
analysis and machine learning. By Rene Brunner

The R language is one of the best solutions for statistical data analysis. R is ideal for tasks such as data science and machine learning. R, which was created by Ross Ihaka and Robert Gentleman at the University of Auckland in 1991, is a GNU project that is similar to the S language, which was developed in the 1970s at Bell Labs.
R is an interpreted language. Input is either executed directly in the command-line interface or collected in scripts. The R language is open source and completely free. R, which runs on Linux, Windows, and macOS, has a large and active community that is constantly creating new, customized modules.
R was developed for statistics, and it comes with fast algorithms that let users analyze large datasets. There is a free and very well-integrated development environment named RStudio, as well as an excellent help system that is available in many languages.
The R language works with a library system, which makes it easy to install extensions as prebuilt packages. It is also very easy to integrate R with other well-known software tools, for example Tableau, SQL, and MS Excel. All of the libraries are available from a worldwide repository, the Comprehensive R Archive Network (CRAN) [1]. The repository contains over 10,000 packages for R, as well as important updates and the R source code.
The R language includes a variety of functions for managing data, creating and customizing data structures and types, and other tasks. R also comes with analysis functions, descriptive statistics, mathematical set and matrix operations, and higher-order functions, such as those of the Map Reduce family. In addition, R supports object-oriented programming with classes, methods, inheritance, and polymorphism.

Installing R
You can download R from the CRAN website. The CRAN site also has installation instructions for various Linux distributions. It is a good idea to also use an IDE. In this article, I will use RStudio, which is the most popular IDE for R.
RStudio is available in two formats [2]. RStudio Desktop is a normal desktop application, and RStudio Server runs as a remote web server that gives users access to RStudio via a web browser. I used RStudio Desktop for the examples in this article.
When you launch RStudio Desktop after the install, you are taken to a four-panel view (Figure 1). On the left is an editor, where you can create an R script, and a console that lets you enter queries and display the output directly. Top right, the IDE shows you the environment variables and the history of executed commands. The visualizations (plots) are output at the bottom right. This is also where you can add packages and access the extensive help feature.

First Commands
When you type a command at the command prompt and press Enter, RStudio immediately executes that command and displays the results. Next to the first result, the IDE outputs [1]; this stands for the first value in your result. Some



COVER STORY
R For Science

commands return more than one value, and the results can fill several lines.
To get started, it is a good idea to take a look at R's data types and data structures. More advanced applications build on this knowledge; if you skip over it, you might be frustrated later. Plan some time for the learning curve. The basic data types in R are summarized in Table 1. Table 2 summarizes some R data structures.
To create an initial graph, you first need to define two vectors x and y, as shown in the first two lines of Listing 1. The c stands for concatenate, but you could also think of it as collect or combine. You then pass the vectors x and y to the plot() function (last line of Listing 1); the col parameter defines the color of the points in the output. Figure 2 shows the results.

Listing 1: First Chart
x <- c(1, 3, 5, 8, 12)
y <- c(1, 2, 2, 4, 6)
plot(x,y,col="red")

Table 1: Data Types in R
Type                   | Designation | Examples
Logical values         | LOGICAL     | TRUE and FALSE
Integers               | INTEGER     | 1, 100, 101
Floating-point numbers | NUMERIC     | 5.1, 100.1
Strings                | CHARACTER   | "a", "abc", "house"

Table 2: Data Structures in R
Name       | Description
Vector     | The basic data structure in R. A vector consists of a certain number of components of the same data type.
List       | A list contains elements of different types, such as numbers, strings, vectors, matrices, or functions.
Matrix     | Matrices do not form a separate object class in R but consist of a vector with added dimensions. The elements are arranged in a two-dimensional layout and have rows and columns.
Data frame | One of the most important data structures in R. This is a table in which each column contains values of a variable and each row contains a set of values from each column.
Array      | An array stores data in more than two dimensions. An array with the dimensions (2, 3, 4) creates four rectangular matrices, each with two rows and three columns.

Installing Packages
Each R package is hosted on CRAN, where R itself is also available. But you do not need to visit the website to download an R package. Instead, you can install packages directly at the R command line. The first thing you will want to do is fetch a library for visualizations. To do this, call the install.packages("ggplot2") command in the command prompt console. The installation requires a working C compiler.
Setting up a package does not make its features available in R yet – it just puts them on your storage medium. To use the package, you need to call it in the R session with the library("ggplot2") command. After restarting R, the library is no longer active; you might need to re-enable it. Newcomers tend to overlook this step, which often leads to time-consuming troubleshooting.

RStudio Scripts
A script is a plain text file in which you store the R code. You can open a script file in RStudio via the File menu.
RStudio has many built-in features that make working with scripts easier. First, you can run a line of code automatically in a script by clicking the Run button or pressing Ctrl+Enter. R then executes the line of code in which the cursor is located. If you highlight a complete section, R will execute all the highlighted code. Alternatively, you run the entire script by clicking the Source button.

Figure 1: The main window of the RStudio IDE is divided into panels.

Data Analysis
A typical process in data analysis involves a series of phases. The primary step in any data science project is to gather the right data from various internal and external sources. In practice, this step is often underestimated – in which case problems arise with data protection, security, or technical access to interfaces.
Data cleaning or data preparation is a critical step in data analysis. The data




collected from various sources might be disorganized, incomplete, or incorrectly formatted. If the quality of the data is not good, the findings will not be of much use to you later on. Data preparation usually takes the most time in the data analysis process.
After cleaning up the data, you need to visualize the data for a better understanding. Visualization is usually followed by hypothesis testing. The objective is to identify patterns in the dataset and find important potential features through statistical analysis.
After you draw insights from the data, a further step typically follows: You will want to predict how the data will evolve in the future. Prediction models are used for this purpose. Historical data is divided into training and validation sets, and the model is trained with the training dataset. You then verify the trained model using the validation dataset and evaluate its accuracy and efficiency.

Data Visualization
R has powerful graphics packages that help with data visualization. These tools produce graphics in a variety of formats, which can also be inserted into documents of popular office suites. The formats include bar charts, pie charts, histograms, kernel density charts, line charts, box plots, heat maps, and word clouds.
To quickly generate a couple of plots using the previously installed ggplot2 package, first create two vectors of equal length. The first is a set of x-values; the second is a set of y-values. Next, square the values of the x vector to generate the values for the y vector, and finally output the graph (Listing 2).

Listing 2: Sample Graph
> x <- c(-1, -0.8, -0.6, -0.4, -0.2, 0, 0.2, 0.4, 0.6, 0.8, 1)
> y <- x^2
> qplot(x, y)

Figure 2: An initial, very simple chart in R. The coordinates of the data points were passed in as vectors.

The scatter plot is one of the chart types commonly used in data analysis; you can create a scatter plot using the plot(x, y) function. You can pass in other parameters, such as main for the header input, xlab for the x-axis labels, and ylab for the y-axis labels. Listing 3 uses a dataset supplied by R from the US magazine Motor Trend in 1974, covering 10 aspects of 32 vehicle models, including number of cylinders, vehicle weight, and gasoline consumption. Load the dataset by typing:

data(mtcars)

The command head(mtcars) then displays the first six lines.

Listing 3: Vehicle Data Example
> plot(mtcars$wt, mtcars$mpg, main = "Scatter chart", xlab = "Weight (wt)", ylab = "Miles per gallon (mpg)", pch = 20, frame = FALSE)
> fit <- lm(mpg ~ wt, data=mtcars)
> abline(fit, col="red")

Figure 3: The regression line illustrates the relationship between the vehicle weight and range.

Listing 4: Box Plots
> qplot(factor(cyl), mpg, data = mtcars, geom = "violin", color = factor(cyl), fill = factor(cyl))

Listing 5: Data Cleanup
> colnames(mtcars)[colnames(mtcars) == 'cyl'] <- 'Zylinder'
> without.zeros <- na.omit(mtcars)
> without.duplicates <- unique( mtcars )

Use the abline() function to add a regression line to the graph (Figure 3). To do this, lm() first calculates the linear regression between the range and the weight, which shows that there is a relationship. This is a negative correlation: The lighter a vehicle is, the farther it can travel on the same amount of gasoline. The graph says nothing about the strength of the relationship, but summary(fit) provides a variety of characteristic values of the calculation. This includes a fairly high




R-squared value, a statistical measure of how close the data points are to the regression line.
Histograms visualize the distribution of a single variable. A histogram shows how often a certain measured value occurs or how many measured values fall within a certain interval. The qplot command automatically creates a histogram if you only pass in one vector to plot. qplot(x) creates a simple histogram from x <- c(1, 2, 2, 3, 3, 4, 4, 4).
The box plot, also known as a whisker diagram, is another type of chart. A box plot is a standardized method of displaying the distribution of data based on a five-value summary: minimum, first quartile (Q1), median, third quartile (Q3), and maximum. In addition, a box plot highlights outliers and reveals whether the data points are symmetrical and how closely they cluster.
In R you can generate a box plot, for example, with qplot(). The best way to generate a box plot is with the sample data from mtcars. To use the cyl column as a category, factor() first needs to convert the values from numeric variables to categorical variables. This is done with the factor() command (Listing 4).
Thanks to the special display form that the geom="violin" parameter sets here, you can see at first glance that, for example, the vast majority of eight-cylinder engines can travel around 15 miles on a gallon of fuel, whereas the more frugal four-cylinder engines manage between 20 and 35 miles with the same amount (Figure 4).

Figure 4: Miles per gallon for 4-, 6-, and 8-cylinder vehicles.

Data Cleanup
Data cleanup examples are difficult to generalize, because the actions you need to take heavily depend on the individual dataset. But there are a number of fairly common actions. For example, you might need to rename cryptically labeled columns. The recommended approach is to first standardize the designations. Then change the column names with the colnames() command. Then pass in the index of the column whose name you want to change in square brackets. The index of a particular column can also be found automatically (Listing 5, first line). If you do not want to overwrite the column caption of the original mtcars dataset, first copy the data to a new data frame with df <- mtcars.
If the records have empty fields, this can lead to errors. That's why it is a good idea to resolve this potential worry at the start of the cleanup. Depending on how often empty fields occur, you can either fill them with estimated values (imputation) or delete them. The command from the second line of Listing 5 removes all rows that contain at least one missing value (NA or NaN).
Records also often contain duplicates. If the duplicate is the result of a technical error in data retrieval or in the source system, you should first try to correct this error. R provides an easy way to clean up the dataset and assign the results to a new, clean data frame with the unique() command (Listing 5, last line).

Predictive Modeling
In reality, there are a variety of prediction models with a wide range of parameters that provide better or worse results depending on the requirements and data. For an example, I'll use a dataset for irises (the flowers) – one of the best-known datasets for machine learning examples.
As an algorithm, I use a decision tree to predict the iris species – given certain properties, for example, the length (Petal.Length) and width (Petal.Width) of the petals. To do this, I first need to load the data, which already exists in an R library (Listing 6, line 1).
The next thing to do is to split the data into training and test data. The training data is used to train the model, whereas the test data checks the predictions and evaluates how well the model works. You would typically use about 70 percent of the data for training and the remaining 30 percent for testing. To do this, first determine the length of the record using the nrow() function and multiply the number by 0.7 (Listing 6, lines 2 and 3). Then randomly select an appropriate amount of data (line 5).
I have set a seed of 101 for the random value selection in the example (line 4). If you set the same value for the seed, you will see identical random values. Following this, split the data into iris_train for training and iris_test for validation (lines 6 and 7).

Listing 6: Prediction with Iris Data
01 > data(iris)
02 > n <- nrow(iris)
03 > n_train <- round(.70 * n)
04 > set.seed(101)
05 > train_indicise <- sample(1:n, n_train)
06 > iris_train <- iris[train_indicise, ]
07 > iris_test <- iris[-train_indicise, ]
08 > install.packages("rpart")
09 > install.packages("rpart.plot")
10 > library(rpart)
11 > library(rpart.plot)
12 > iris_model <- rpart(formula = Species ~ ., data = iris_train, method = "class")
13 > rpart.plot(iris_model, type=4)




After splitting the data, you can train and evaluate the decision tree model. To do this, you need the rpart library; rpart.plot visualizes the decision tree (lines 8 to 11). Next, generate the decision tree based on the training data. When doing so, pass in the Species column in order to predict which iris species you are looking at (line 12).
One advantage of the decision tree is that it is relatively easy to see which parameters the model refers to. rpart.plot lets you visualize and read the parameters (line 13). Figure 5 shows that the iris species is setosa if the Petal.Length is less than 2.5. If the Petal.Length exceeds 2.5 and the Petal.Width is less than 1.7, then the species is probably versicolor. Otherwise, the virginica species is the most likely.
The next step in the analysis process is to find out how accurate the results are. To do this, you need to feed the model data that it hasn't seen before. The previously created test data is used for this purpose. Use predict() to generate predictions based on the test data with the iris_model model (Listing 7, line 1).
There are a variety of metrics for determining the quality of the model. The best known of these metrics is the confusion matrix. To compute a confusion matrix, first install the caret library (lines 2 and 3), which will give you enough time for an extensive coffee break even on a fast computer. Then evaluate the iris_pred data (line 4).

Listing 7: Accuracy Estimation
01 > iris_pred <- predict(object = iris_model, newdata = iris_test, type = "class")
02 > install.packages("caret")
03 > library(caret)
04 > confusionMatrix(data = iris_pred, reference = iris_test$Species)

The statistics show that the model operates with an accuracy of 93 percent. The next step would probably be to optimize the algorithm or find a different algorithm that offers greater accuracy.
You can now also imagine how this algorithm could be applied to other areas. For example, you could use environmental climate data (humidity, temperature, etc.) as the input, combine it with information on the type and number of defects in a machine, and use the decision tree to determine the conditions under which the machine is likely to fail.

Importing Data
If you want to analyze your own data now, you just need to import the data into R to get started. R lets you import data from different sources.
To import data from a CSV file, first pass the file name (including the path if needed) to the read.table() function and optionally specify whether the file contains column names. You can also specify the separator character for the fields in the lines (Listing 8, first line).
If the data takes the form of an Excel spreadsheet, you can also import it directly. To do this, install the readxl library and use read_excel() (second line) to import the data.

Listing 8: Data Import
> df <- read.table("meine_datei.csv", header = FALSE, sep = ",")
> my_daten <- read_excel("my_excel-file.xlsx")

Conclusions
The R language is a powerful tool for analyzing and visualizing scientific data. This article took a look at how to install R, RStudio, and the various R libraries. I also described the various data structures in R and introduced some advanced analysis methods. Now you can jump in and start using R for your own scientific data analyses. Q Q Q
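Box: The Tree's Rules in Plain Code
The two splits that the decision tree learns for the iris data are compact enough to restate outside of R, which also makes it easy to see how a confusion matrix is tallied. The following Python sketch is purely illustrative: the thresholds (2.5 for Petal.Length, 1.7 for Petal.Width) are the ones discussed in the article, the test samples are made up, and nothing here replaces rpart or caret.

```python
def classify(petal_length, petal_width):
    """Apply the two splits of the decision tree discussed in the article."""
    if petal_length < 2.5:
        return "setosa"
    if petal_width < 1.7:
        return "versicolor"
    return "virginica"

# Made-up test samples: (petal_length, petal_width, true_species)
samples = [
    (1.4, 0.2, "setosa"),
    (4.5, 1.3, "versicolor"),
    (5.1, 1.9, "virginica"),
    (5.0, 1.6, "virginica"),   # falls on the wrong side of the width split
]

labels = ["setosa", "versicolor", "virginica"]
# Confusion matrix: rows are the true species, columns the predictions
matrix = {t: {p: 0 for p in labels} for t in labels}
correct = 0
for length, width, truth in samples:
    pred = classify(length, width)
    matrix[truth][pred] += 1
    correct += (pred == truth)

for truth in labels:
    print(truth, [matrix[truth][p] for p in labels])
print("accuracy:", correct / len(samples))
```

The last sample is deliberately chosen to be misclassified, so the printed matrix shows one virginica counted in the versicolor column and the accuracy comes out at 0.75 — the same kind of readout that confusionMatrix() produces, just in miniature.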

Info
[1] CRAN: https://cran.r-project.org

[2] RStudio download: https://www.rstudio.com/products/rstudio

Author
Rene Brunner is the founder of Datamics, a consulting company for Data Science Engineering, and Chair of the Digital Technologies and Coding study program at the Macromedia University. With his online courses on Udemy and his "Data Science mit Milch und Zucker" podcast, he hopes to make data science and machine learning accessible to everyone.

Figure 5: Visualizing the decision tree model with the iris data.




REVIEW
Distro Walk – Immutable Distros

The rise of immutable distros

Steadfast
Immutable distributions offer a layer of added security.
Bruce explains how immutable systems work and
discusses their benefits and drawbacks. By Bruce Byfield

The concept of immutable objects – objects that can be replaced but not edited – is not new to Linux. Programming languages such as Rust, Erlang, Scala, Haskell, and Clojure have immutable objects, and many programming languages allow immutable variables. Similarly, the chattr command has an immutable attribute for directories and files.
In recent years, immutable systems have emerged, originally for the cloud or embedded devices, but now for servers and desktop environments as well. Some of these distros are new, and many are based on major distributions such as Debian, openSUSE, and Ubuntu. All are seen as adding another layer of security, and most use containers and universal packages, bringing these technologies to the average user for everyday use (see Table 1).

Table 1: Selected Immutable Distros
blendOS: An Arch Linux-based distro suitable for beginners that runs packages from multiple distros on the same desktop
Bottlerocket: A distro for use with Amazon Web Services
carbonOS: A Gnome-based distro that includes system updates
CoreOS: A distro used by Red Hat Enterprise Linux (RHEL)
Fedora Silverblue: A variant of Fedora Workstation that is perhaps the most popular immutable distro
Fedora Kinoite: A Plasma-based variant of Fedora Workstation
Fedora Sericea: A variant of Fedora Workstation that uses the Sway window manager
Fedora CoreOS: A distro designed for clusters (but operable as standalone) and optimized for Kubernetes
Flatcar Container Linux: A minimal distro that includes only container tools and no package manager
RancherOS: A light, minimal system with immutability provided by read-only permissions
NixOS: An immutable system, plus rollbacks, system cloning, 80k packages, preinstall package testing, and multiple versions of packages
Guix: Similar to NixOS, but aimed at advanced users
Talos Linux: A distro designed for the cloud and use with Kubernetes, with a minimal installation
Endless OS: A Debian-based distro aimed at new users that works offline
Nitrux: A Debian- and Plasma-based distro
openSUSE MicroOS: A server-oriented distro with transactional updates via Btrfs
Vanilla OS: A Debian-based distro with emphasis on desktop and user experience
Ubuntu Core: In development since 2014, a well-documented distro specifically designed for embedded devices
Discontinued: k3os, a minimal distro for running Kubernetes clusters

Author
Bruce Byfield is a computer journalist and a freelance writer and editor specializing in free and open source software. In addition to his writing projects, he also teaches live and e-learning courses. In his spare time, Bruce writes about Northwest Coast art (http://brucebyfield.wordpress.com). He is also co-founder of Prentice Pieces, a blog about writing and fantasy at https://prenticepieces.com/.

Photo by Egor Myznik on Unsplash




The Immutable Architecture
The structure of immutable systems is complicated and varies with the distribution. While only an overview can be given here, the general definition of an immutable distro is a core operating system, usually placed in a separate container, that is read-only. Once installed, this core system cannot be permanently edited. Any editing attempt will be lost once the system is rebooted. Unlike in traditional systems, not even a root user can alter this core. Instead, the core can only be completely replaced by what is described as an atomic update during a system reboot (i.e., the update must be applied all at once or not at all). Often, each update can be stored like a snapshot for backup and may be chosen at bootup. These images may be handled by an application like Fedora Silverblue's ostree or through snapshots in a Btrfs filesystem, as with openSUSE's MicroOS.
But what about non-core components? As you probably know, traditional package managers deal with one package at a time, adding dependencies as needed. Because a dependency might be an application or library that is part of the core system, in an immutable system, this approach would only alter the system until the next boot, when the change would be lost and the non-core package might cease to work. Instead, immutable distros often use a universal package system such as AppImage, Flatpak, or Snap. Because universal packages contain their own dependencies, they can be run without interfering with the immutable core. Should a problem somehow emerge regardless, the system can be rolled back at boot. Alternatively, blendOS places traditional packages from each traditional distribution in a separate container, so that its immutable desktop can run multiple versions of the same package.
How much of this structure is visible from the desktop varies considerably. Some immutable distributions like Vanilla OS and blendOS include graphical tools for such tasks as creating containers (Figure 1) and controlling updates (Figure 2) and universal packages (Figure 3). In others like Fedora Silverblue, the immutable aspects are hidden on the desktop. For example, in Silverblue, /home is a symbolic link to /var/home, and the immutable structure is placed in /sysroot (Figure 4). The most obvious structure in any immutable distro is usually the tool for updating, like Silverblue's ostree and utilities for managing containers.

Figure 1: The blendOS desktop tool for creating containers.
Figure 2: The Vanilla OS desktop tool for updates.
Figure 3: The Vanilla OS desktop tool for managing AppImage packages.

The Immutable Advantage
Details can differ from the general description given here. However, all immutable distros share the same advantages:
• Added security: Even if the core system is somehow cracked, any changes will disappear upon reboot. Moreover, with universal or containerized packages, changes are harder to spread from one application to another.
• Accident proof: System files cannot be altered by mistake. Atomic updates eliminate partial updates, and snapshots allow rollbacks.
• Easier administration: Testing, troubleshooting, and cloning are easier because of the more rigid structure.
Perhaps the greatest advantage, though, is that embedded and desktop development are no longer as separated as they have been in the past. In immutable systems, tools that once seemed relevant mainly to embedded systems, such as containers and universal packages, are given practical purposes in desktop environments.

Possible Limitations
Like most new technologies, immutable desktops are often overhyped. For this reason, I should stress that immutable desktops have their limits. For one thing, any container is only as secure as its contents, so immutable distros can never be totally secure. There is always the chance that bugs or security attacks can be introduced accidentally or deliberately when a container is created. If that happens, it could easily be missed out of a false sense of security. For another, unlike traditional packages, universal packages each contain their own libraries, which may not be practical on systems with low memory. Vanilla OS, for example, requires 50GB for storage.
Perhaps more importantly, immutable desktops require more maintenance than traditional package systems like .deb or .rpm. Instead of a single package and its dependencies, in at least some cases, an entirely new system image must be created to avoid the unintended introduction of new problems. Either more hands or more hours are probably needed to assure quality. For rolling distributions like Arch Linux, whose emphasis is on the newest software, immutable releases seem especially impractical, although some sort of compromise with occasional immutable releases might be possible.
Such concerns suggest that immutable systems may not be suitable for every situation. But if general and rolling releases can coexist, there seems no reason why immutable distros cannot find a niche as well. Q Q Q

Figure 4: Fedora Silverblue stores system images and other files for its ostree tool in /sysroot.
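Box: Atomic Updates in Miniature
The "all at once or not at all" property of an atomic update is easy to demonstrate on a small scale. The following Python sketch is a toy model – not how ostree or MicroOS actually work – that stages a complete new "system tree" in its own directory and then flips a single symlink with an atomic rename. If the process dies before the rename, the old tree is still fully in place; there is no half-applied state.

```python
import os
import tempfile

def atomic_switch(current_link, new_tree):
    """Repoint 'current_link' at 'new_tree' in one atomic step.

    A temporary symlink is prepared first and then renamed over the old
    one. os.replace() maps to rename(2), which is atomic on POSIX, so the
    link always refers to either the complete old tree or the complete
    new tree -- never to a half-applied update.
    """
    tmp = current_link + ".tmp"
    os.symlink(new_tree, tmp)      # stage the new deployment
    os.replace(tmp, current_link)  # flip the switch atomically

root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "tree-a"))  # the currently running system
os.mkdir(os.path.join(root, "tree-b"))  # the fully staged update

link = os.path.join(root, "current")
atomic_switch(link, "tree-a")
print(os.readlink(link))  # tree-a
atomic_switch(link, "tree-b")
print(os.readlink(link))  # tree-b
```

Keeping the previous target directory around is what makes rollbacks cheap: Pointing the link back at the old tree is just one more atomic switch.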




IN-DEPTH
AlmaLinux

AlmaLinux promises continued RHEL compatibility

Friendly Fork
Recent policy changes at Red Hat have upturned the RHEL clone community. AlmaLinux charts a new
path by shifting to binary compatibility and away from being a downstream RHEL build. By Amy Pettle

When Red Hat discontinued CentOS and replaced it with CentOS Stream in late 2020, AlmaLinux stepped forward to build a community downstream version of Red Hat Enterprise Linux (RHEL). In a desire to fill this void in the Enterprise Linux ecosystem, CloudLinux collaborated with the community to develop AlmaLinux OS as a downstream build of RHEL. After the first stable release in March 2021, CloudLinux turned governance of AlmaLinux OS over to the nonprofit AlmaLinux OS Foundation. From there, AlmaLinux chugged along for over two years providing the Enterprise Linux community with a forever-free Linux distro while offering long-term stability and a production-grade platform.
That all changed in June 2023 when Red Hat announced that RHEL-related source code would be restricted to Red Hat's customer portal. CentOS Stream, an upstream version of RHEL that contains experimental packages, would now be the sole repository for public RHEL-related source code releases. Because Red Hat's subscription agreement prohibits customers from redistributing code, this move appeared to put an end to downstream builds like AlmaLinux as well as other RHEL clones like Rocky Linux and Oracle Linux.
Some were quick to predict the demise of these RHEL clones, but AlmaLinux, Rocky Linux, and others quickly charted a path forward. While Rocky Linux and the newly formed OpenELA (founded by Oracle, SUSE, and CIQ) have promised to retain 1:1 compatibility with RHEL, citing their rights under the GPL, AlmaLinux is forging a different path forward.
AlmaLinux plans to maintain application binary interface (ABI) compatibility to continue to provide the community with a forever-free Enterprise Linux solution. (See the "New Path Forward" box for our interview with benny Vasquez, AlmaLinux OS Foundation Chair, to learn why they chose this route.)

1:1 vs. ABI Compatibility
In 1:1 compatibility, a clone distribution provides an exact copy of RHEL's functionality, behavior, and binary compatibility, including bug-to-bug compatibility. It is an exact replica of RHEL minus RHEL's branding and trademarks.
With ABI compatibility, AlmaLinux guarantees that all apps developed for RHEL or its clones will run on AlmaLinux without any modifications or extra work on the part of the user. AlmaLinux will not be an exact copy, but it will include kernel and application compatibility. This also means that AlmaLinux will not guarantee bug-to-bug compatibility. While some users might find bugs not found in RHEL, AlmaLinux also has the opportunity to include bug fixes not yet addressed by Red Hat, as well as possibly offer new features not available in RHEL.

Adjustments
Prior to Red Hat moving RHEL source code behind a paywall, any security update or bug fix in RHEL resulted in Red Hat publishing the corresponding code to a public repository. AlmaLinux then integrated this updated code into their own build and test system, produced a new RPM package, and then published the code in the AlmaLinux repositories.
Instead of updates and patches coming from a single repository, AlmaLinux now must gather them from multiple sources and then compare, test, and build the new release from these sources. To achieve ABI compatibility, AlmaLinux will use CentOS Stream (the upstream version of RHEL still available to the public) and then get additional code from Red Hat Universal Base Images (UBIs) and upstream Linux code.
In a recent talk at All Things Open [1], Vasquez noted that 99 percent of the packages would match RHEL source code. Of this 99 percent, 75 percent will be built from CentOS Stream or UBI images, while approximately 24 percent will require manual patching.
The remaining one percent that differs from RHEL lies in the kernel patches. These kernel updates pose the biggest challenge because AlmaLinux can no longer pull these updates from Red Hat without violating licensing agreements. Moving forward, AlmaLinux plans to pull kernel updates from various other sources, and, if all else fails, the Oracle releases (which are also based on RHEL).
On the upside, AlmaLinux can now include comments in their patches for greater transparency. Users will see where the patch comes from, which was not an option before.
Finally, AlmaLinux now asks users who find bugs in AlmaLinux to attempt to test and replicate the problem in CentOS Stream in order to let developers correct the issue in the right place.

Photo by Alex Kondratiev on Unsplash




New Additions
No longer bound to 1:1 compatibility, AlmaLinux can set its own priorities rather than following RHEL's lead. AlmaLinux now has the opportunity to include features that meet the needs of its community, whether that is fixing bugs faster (like the AMD microcode exploits [2]) or adding new features.
In August 2023, AlmaLinux added two new repositories, Testing and Synergy [3]. Testing, currently available for AlmaLinux 8 and 9, offers security updates before they are approved and implemented upstream. AlmaLinux has invited community members to help test these updates. (As per usual, Testing is not recommended for production machines.)
Synergy contains packages requested by community members that currently aren't available in RHEL or Extra Packages for Enterprise Linux (EPEL, a set of extra software packages maintained by the Fedora SIG that are not available in RHEL or CentOS Stream). Synergy is available for AlmaLinux 8 and 9 as well as all Enterprise Linux users (e.g., RHEL, Rocky Linux, Oracle Linux, CentOS Stream). Once accepted to EPEL, these packages will be removed from Synergy. At the time of writing, current packages include the Pantheon Desktop Environment and the Warpinator app. Community members can request packages via the AlmaLinux Packaging chat channel in Mattermost [4].

Conclusion
Despite Red Hat making it more difficult to use RHEL code, AlmaLinux has adjusted course, relying on ABI compatibility to deliver a RHEL alternative for the Enterprise Linux ecosystem. Moving forward, AlmaLinux plans to continue contributing upstream to CentOS Stream, Fedora, and Linux in general.
At the time of writing, AlmaLinux has announced the first releases using the new build process, beta versions of AlmaLinux 8.9 and 9.3, so you can see for yourself how ABI compatibility works. Q Q Q

This article was made possible by support from AlmaLinux OS Foundation through Linux New Media's Topic Subsidy Program (https://www.linuxnewmedia.com/Topic_Subsidy).

Info
[1] benny Vasquez talk at All Things Open 23: https://www.youtube.com/watch?v=Jjda39dlu7I
[2] AMD microcode exploits: https://www.amd.com/en/resources/product-security/bulletin/amd-sb-7005.html
[3] Testing and Synergy repositories: https://almalinux.org/blog/new-repositories-for-almalinux-os-synergy-and-testing/
[4] Packaging chat channel: https://chat.almalinux.org/login?redirect_to=%2Falmalinux%2Fchannels%2Fengineeringpackaging

Author
Amy Pettle is an editor for ADMIN and Linux Magazine.

New Path Forward
We talked to benny Vasquez, chair of the AlmaLinux OS Foundation, about their decision to shift to ABI compatibility in the wake of the changes at Red Hat.
Linux Magazine (LM): What prompted AlmaLinux to choose ABI over 1:1 compatibility with RHEL?
benny Vasquez (bV): The short answer is our users. Overwhelmingly, our users made it clear that they chose AlmaLinux for its ease of use, the security and stability that it provides, and the backing of a diverse group of sponsors. All of that together meant that we didn't need to lock ourselves into copying RHEL, and we could continue to provide what our users needed.
Moreover, we needed to consider what our sponsors would be able to help us provide, and how we could best serve the downstream projects that now rely on AlmaLinux. The rippling effects of any decision that we make are beyond measure at this point, so we consider all aspects of our impact and then move forward with confidence and intention.
LM: How did AlmaLinux's mission of improving the Linux ecosystem for everyone influence this decision?
bV: We strongly believe that the soul of open source means working together, providing value where there is a gap, and helping each other solve problems. If we participate in an emotional reaction to a business's change, we will then be distracted and potentially hurt users and the Enterprise Linux ecosystem overall. By remaining focused on what is best (though not easiest), and adapting to the ecosystem as it is today, we will provide a better and more stable operating system.
LM: What opportunities does the ABI route offer over 1:1 compatibility?
bV: By liberating ourselves from the 1:1 promise, we have been able to do a few small things that have proven to be a good testing ground for what will come in the future. Specifically, we shipped a couple of smallish, but extremely important, security patches ahead of Red Hat, offering quicker security to the users of AlmaLinux. We also announced two additional repositories. One for testing and one for new packages that aren't available in our upstream or in EPEL.
This also opens the door for other features and improvements that we could add back in or change, as our users need. We have already seen greater community involvement, especially around these ideas.
LM: Does the ABI route pose any extra challenges?
bV: The obvious one is that building from CentOS Stream sources takes more effort, but I think the more important challenge (and the one that will only be solved with consistency over time) is the one of proving that we will be able to deliver on the promise. With a community like ours, rebuilding someone else's code doesn't take as much effort. Technically, building from Stream takes more time for sure, but the public perception is that it will lead to greater divergence from RHEL. I think folks will be seriously happy about what they find as we release the new versions, namely, the consistency, stability, and security that they've come to expect from us.
LM: Since you are no longer bound to conform to 1:1 compatibility, what do you see in AlmaLinux's future?
bV: We will continue on our goal of becoming the home for all users that need Enterprise Linux for free, but in the next year I expect that we will see an expansion in the number of kernels we support and see some new and exciting SIGs spun up around other features or use cases, as the community continues to standardize on how to achieve their goals collectively.
LM: What do you think your relationship with Red Hat will look like moving forward?
bV: Ultimately our goal is to improve the Enterprise Linux ecosystem, and we'll welcome anyone who is actively working toward that goal. We have loved seeing the positive infusion of energy that the AlmaLinux users have been able to build on and are excited to see that continue to expand through the entire ecosystem.



IN-DEPTH
Acoustic Keyloggers

An introduction to acoustic keyloggers

Keyboard
Eavesdropping
Is someone listening in on your typing? Learn more about how acoustic keyloggers work. By Chris Binnie

With all the discussion about the application of artificial intelligence (AI) in cybersecurity, we are reminded that criminals are paying close attention to AI's advances. New functionality identified by British researchers [1] involves training a deep learning model to listen in on the acoustic sounds made by keyboards when a user is typing. The model then records the audio from the typing and determines what was typed. Applications include recording users logging in to sensitive online accounts or entering payment details.
However, this type of attack does not require AI to do damage. Keylogging tools already exist that can listen in on your typing. While it might sound paranoid, you might be surprised how advanced such tools have become, even without machine learning (ML) removing much of the required "training" time for an acoustic keylogger to fully recognize keyboard sounds.
To get you up to speed on keylogging, I will explain how keylogging works and look at some of the tools currently available on Linux.

What's All the Fuss?
Popularized in movies, the logging of keystrokes often involves malware being installed on a target machine with a USB drive. Once installed, anything typed on the keyboard attached to the infected computer is saved and forwarded to the attacker, giving them access to passwords, credit card numbers, bank account information, and more. Of course, today the malware payload can be just as easily delivered by unwelcome JavaScript unsuspectingly executed by the browser when you visit an infected web page.
There are some legitimate (though contentious) uses of this technology. For example, parents might monitor their child's tablet usage or a corporate employer might keep tabs on an employee's computer usage.
A recent article on the Bleeping Computer website [1] regarding the British deep learning acoustic attack study makes two fascinating points. Firstly, the study outlines the baseline where a training algorithm receives enough training data to recognize the sound of each keystroke. Bleeping Computer noted: "The researchers gathered training data by pressing 36 keys on a modern MacBook Pro 25 times each and recording the sound produced by each press." Devices such as phones, or anything with a reasonable quality microphone (also infected by malware, most likely), are used for the recording. The study also used videoconferencing software (specifically Zoom) to record keystrokes when attendees logged into various accounts during the meeting.
Secondly, when presented with the above training data, the AI's success rate was incredible. Overall the success rate was a staggering 95 percent. Zoom calls achieved a 93 percent success rate and Skype managed 91.7 percent accuracy, according to the Bleeping Computer article.
Keyloggers can be deployed in many different ways. For instance, Endpoint Detection and Response (EDR) technology was found to have missed the presence of BlackMamba keylogging malware. According to an article in Dark Reading [2], such an attack "demonstrates how AI can allow the malware to dynamically modify benign code at runtime without any command-and-control (C2) infrastructure, allowing it to slip past current automated security systems that are attuned to look out for this type of behavior to detect attacks."
The Dark Reading article concludes that without extensive research combined with effort from the security industry, solutions will struggle to keep us secure.
Now that your fight-or-flight senses are tingling suitably, I will show you some tools in action.

Lead Image © Sergey Galushko, 123RF.com
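Box: Matching Keystroke Sounds to Training Data
The core idea behind the training step described above – record each key many times, then match a captured sound against those samples – can be sketched without any audio processing at all. In the following Python toy, the "fingerprints" are made-up pairs of numbers standing in for features (such as peak energy and duration) that a real attack would extract from recordings; the nearest-neighbor lookup is an illustration of the matching principle, not the deep learning approach the researchers used.

```python
import math

# Toy training set: a few recorded "fingerprints" per key. The numbers are
# synthetic stand-ins for features extracted from each keystroke's audio.
TRAINING = {
    "a": [(0.91, 0.30), (0.90, 0.31), (0.92, 0.29)],
    "s": [(0.55, 0.42), (0.54, 0.44), (0.56, 0.43)],
    "d": [(0.20, 0.77), (0.21, 0.75), (0.20, 0.76)],
}

def classify_keystroke(sample):
    """Return the key whose training presses sound closest to 'sample'."""
    best_key, best_dist = None, math.inf
    for key, presses in TRAINING.items():
        for press in presses:
            dist = math.dist(sample, press)  # Euclidean distance
            if dist < best_dist:
                best_key, best_dist = key, dist
    return best_key

# An intercepted keystroke whose fingerprint resembles the "s" samples:
print(classify_keystroke((0.56, 0.41)))  # s
```

Scaling this up from three keys to a full keyboard, and from two hand-picked numbers to features learned from thousands of real recordings, is essentially what separates this toy from the 95-percent-accurate attack in the study.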




records all common character and function key presses as well as recognizes the Shift and AltGr modifiers.
To use logkeys, you first need to install the build tools required on Debian Linux derivatives (such as Ubuntu Linux):

$ apt install build-essential U
autotools-dev autoconf kbd

Next, you need to clone the repository and move into the logkeys directory with the following command:

$ git clone https://github.com/kernc/U
logkeys.git
$ cd logkeys

You are now ready to create some build files with the following script and enter the directory:

$ ./autogen.sh
$ cd build

Then, you need to check that the build environment is sound before compiling the logkeys software with the following command (note the two dots):

$ ../configure

Finally, you need to compile logkeys for your system as follows, before installing it:

$ make
$ make install

The command compiles with ease on my Ubuntu Linux 22.04 laptop (if you want to remove logkeys, see the "Uninstall logkeys" box). In order to check that the logkeys binary is available to my user's path (or the root user in this case), I start typing the logkeys command and then tab-complete it as shown in Listing 2. Happily, the logkeys help file output appears as hoped.
Now I will run a test to see if I can get logkeys to work using instructions from the documentation [4]. For this test, I will need two terminals. In the first terminal, I will move into the /tmp directory to keep the root user's home directory tidy and then create an empty logfile with the following commands:

$ cd /tmp
$ echo "" > watching_you.log

Next I will start logging output with:

$ logkeys --start U
--output watching_you.log

Then, in the second terminal window, I'll move into the /tmp directory and follow the output with the tail command using:

$ tail -f watching_you.log

Listing 3 shows that something is recorded from each keystroke.
You'll notice that Listing 3 isn't very easy to read thanks to the fact that I use a UK keyboard. If you use a standard US keyboard, then the following command should work for you:

$ logkeys --start --us-keymap U
--output watching_you.log

I need to stop logkeys in order to change the keyboard mapping. During testing, I used the pkill command to stop logkeys while it was running; there's almost certainly a more graceful way of stopping the daemon however.
For those not familiar with pkill, it's a simple route to take instead of using the kill command. Be very careful how you use it as the root user. It essentially saves time spent looking up a process's PID to terminate it. Its purpose is to match the human-readable name of a process before stopping it ungracefully. For more information on pkill, run man pkill.

Uninstall logkeys
Since logkeys is a surveillance tool, you need a way to reliably uninstall it. You can do this from the build repo directory (which is /root/logkeys/build in my case as I'm cloning the repo into the /root directory) using the command in Listing 1.

Listing 1: Uninstalling logkeys
$ make uninstall
Making uninstall in src
make[1]: Entering directory '/root/logkeys/build/src'
( cd '/usr/local/bin' && rm -f logkeys llk llkk )
make[1]: Leaving directory '/root/logkeys/build/src'
[...]

Listing 2: Checking the logkeys Binary
$ logkeys --help
Usage: logkeys [OPTION]...
Log depressed keyboard keys.

-s, --start start logging keypresses
-m, --keymap=FILE use keymap FILE
[...]
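Box: What a Keylogger Actually Reads
Tools like logkeys work by reading the kernel's input event devices (/dev/input/event*), which deliver a stream of fixed-size binary records. To give a flavor of the data involved, here is a small Python sketch that decodes such input_event records. The struct layout shown assumes a 64-bit Linux system, and the example feeds in a hand-built record rather than opening a real device, since reading /dev/input/event* requires root.

```python
import struct

# Layout of a Linux input_event record on a 64-bit system: a timeval
# (two signed longs), then type and code (unsigned shorts) and value (int).
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)
EV_KEY = 0x01  # event type for key presses and releases

def decode_key_events(raw):
    """Yield (key_code, value) pairs for each EV_KEY record in 'raw'."""
    for offset in range(0, len(raw) - EVENT_SIZE + 1, EVENT_SIZE):
        _sec, _usec, etype, code, value = struct.unpack_from(
            EVENT_FORMAT, raw, offset)
        if etype == EV_KEY:
            yield code, value  # value: 1 = press, 0 = release

# A synthetic record standing in for real device input; key code 30 is
# KEY_A in the kernel's key code table:
sample = struct.pack(EVENT_FORMAT, 0, 0, EV_KEY, 30, 1)
print(list(decode_key_events(sample)))  # [(30, 1)]
```

Mapping raw key codes like 30 back to characters is exactly what the keymap files discussed in this article are for – which is why choosing the wrong keymap produces the gobbledygook seen in Listing 3.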

Listing 3: Keystrokes Being Saved


Logging started ...

2023-08-13 12:58:05+0100 > <LShft><LCtrl>E<LAlt><Tab>

2023-08-13 12:58:13+0100 > <Enter><Up><LShft>Xu <BckSp>v yx [ ]q?ux? eqwcyx[g yx?u v <LShft>Yjgg cuq <BckSp><BckSp>#q yxeu
esq <LShft>"neci<LShft>" ?ywq?euwr [x? e[yg esq gua aygqv<BckSp><BckSp><BckSp><BckSp><BckSp>?ygq
[] ]u<LShft>H

2023-08-13 12:58:41+0100 > <Enter>

2023-08-13 12:58:41+0100 > <Enter><LShft>$ e[yg -<BckSp>?? neci

2023-08-13 12:58:52+0100 > <Enter>?<BckSp>e[yg -? [<Tab>?<Tab>

2023-08-13 12:58:56+0100 > <Enter><LCtrl><LShft>?

[...]




Listing 4: Keyboard Layouts
ca_FR.map cs_CZ.map de_CH.map de.map en_GB.map en_US_dvorak.map en_US_ubuntu_1204.map
es_AR.map es_ES.map fr_CH.map fr-dvorak-bepo.map fr.map hu.map it.map no.map pl.map
pt_BR.map pt_PT.map ro.map ru.map sk_QWERTY.map sk_QWERTZ.map sl.map sv.map tr.map

For logkeys, simply enter:

$ pkill logkeys

Once stopped, I use the UK keyboard layout with logkeys via (note the -m switch):

$ logkeys --start U
--output /tmp/watching_you.log U
-m /root/logkeys/keymaps/en_GB.map

The keyboard layouts (shown in Listing 4) are included in /root/logkeys/keymaps, so you don't need to customize them.

Listing 5: logkeys Captures Every Single Key Press
chris@Xeo:/tmp$ tail -f watching_you.log
Logging started ...
2023-08-13 13:35:23+0100 > <PgUp>
2023-08-13 13:35:29+0100 > <Enter><LShft>I am watching you very closely
2023-08-13 13:35:42+0100 > <Enter>
Using the UK keyboard layout, List-
ing 5 now displays actual typed text in-
Listing 6: Docker Compose build Output stead of gobbledygook. It’s a little scary
$ docker-compose build how accurate logkeys is if you look
through the logfile for unusual key com-
[...] binations, that you don’t realize you reg-
ularly use.
???????????????????????????????????????? 9.7/9.7 MB 9.8 MB/s eta 0:00:00 I will leave you to experiment with
Collecting psycopg2-binary==2.8.6 logkeys. If you want to compare features
Downloading psycopg2_binary-2.8.6-cp38-cp38-manylinux1_x86_64.whl (3.0 MB) with other keyloggers, see lkl [5] and
???????????????????????????????????????? 3.0/3.0 MB 9.3 MB/s eta 0:00:00 uberkey [6].
Collecting pytest==6.2.2 An important takeaway: Keyboard lay-
Downloading pytest-6.2.2-py3-none-any.whl (280 kB) out can be important to more traditional
??????????????????????????????????????? 280.1/280.1 kB 3.1 MB/s eta 0:00:00 keyloggers as well as the logfile setup.
Collecting scikit-learn==0.24.1
Downloading scikit_learn-0.24.1-cp38-cp38-manylinux2010_x86_64.whl (24.9 MB)
Sorry, I Missed That
???????????????????????????????????????? 24.9/24.9 MB 9.0 MB/s eta 0:00:00
Acoustic keylogging, which is a form
Collecting scipy==1.6.0
of a side-channel attack [7], uses the
Downloading scipy-1.6.0-cp38-cp38-manylinux1_x86_64.whl (27.2 MB)
audio signal to determine what the
???????????????????????????????????????? 27.2/27.2 MB 9.5 MB/s eta 0:00:00
user is typing. Shoyo Inokuchi created
[...]
acoustic-keylogger [8] as part of his un-
dergraduate studies and it offers a good
Listing 7: Two Container Images way to see how tools like this work.
$ docker images Although Inokuchi’s GitHub repo
REPOSITORY TAG IMAGE ID CREATED SIZE
hasn’t been updated for three years,
the slick installation process (I chose
acoustic-keylogger_env latest 3608317074d5 4 minutes ago 4.06GB
the Docker route) worked flawlessly.
python 3.8 d114ab2cf5bc 3 weeks ago 997MB
However, I wasn’t sure how to view
and analyze the results afterwards.
Listing 8: Running docker-compose up To install acoustic-keylogger (which
$ docker-compose up takes about 594MB of disk space), enter:

$ git clone https://github.com/shoyo/U


Creating network "acoustic-keylogger_default" with the default driver
acoustic-keylogger.git
Pulling db (postgres:11)...
Cloning into 'acoustic-keylogger'...
11: Pulling from library/postgres
[...]
bff3e048017e: Pull complete

[...]
$ cd acoustic-keylogger/

Listing 9: Output Notes


env_1 | To access the notebook, open this file in a browser:
env_1 | file:///root/.local/share/jupyter/runtime/nbserver-1-open.html
env_1 | Or copy and paste one of these URLs:
env_1 | http://09815db9c0c3:8888/?token=c55389826f2c1a66819428bad3e6d75a9f91eda5deccded7
env_1 | or http://127.0.0.1:8888/?token=c55389826f2c1a66819428bad3e6d75a9f91eda5deccded7
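The access token at the end of those URLs can be pulled out of the startup output with standard text tools; this is a convenience sketch of mine, not part of acoustic-keylogger (the token value below is copied from the output above; yours will differ):

```shell
# Extract the Jupyter access token from a startup log line.
line='env_1 | or http://127.0.0.1:8888/?token=c55389826f2c1a66819428bad3e6d75a9f91eda5deccded7'
printf '%s\n' "$line" | grep -o 'token=[0-9a-f]*' | cut -d= -f2
# prints: c55389826f2c1a66819428bad3e6d75a9f91eda5deccded7
```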




Figure 1: The dashboard courtesy of Jupyter.

Figure 2: acoustic-keylogger listens for keystrokes.

To use Docker Compose, you need to install it as follows:

$ apt install docker-compose

which then installs the following new packages:

bridge-utils containerd docker-compose docker.io pigz python3-attr python3-docker
python3-dockerpty python3-docopt python3-dotenv python3-jsonschema python3-pyrsistent
python3-texttable python3-websocket runc ubuntu-fan

This added another 295MB of disk space. You also need to check that Docker Engine installed successfully with:

$ apt install docker.io

Figure 3: Keystroke sounds generated by a MacBook Pro 2016 (source: https://github.com/shoyo/acoustic-keylogger).

Now, you are ready to run the Docker Compose build command, whose output




is shown in Listing 6. This installs all of env's dependencies (i.e., Jupyter, Tensorflow, NumPy, etc.) and mounts your local filesystem alongside the env Docker container filesystem. After running the build command, you need to confirm that two container images are present, as shown in Listing 7.

Then you bring up the database, along with the Python environment, as shown in Listing 8. The Python environment makes use of the Jupyter Notebook [9]. Listing 9 shows additional output notes for accessing the Jupyter Notebook.

To keep the environment up and running, you must leave the terminal open and untouched. The final URL in Listing 9 shows the web interface (Figure 1).

To run tests, make sure that you are still in the cloned repository directory (acoustic-keylogger/) and use the following command:

$ docker-compose run env pytest -q tests

Figure 2 shows the output, with some user input at the top (where I randomly typed keys). Incidentally, my CPU load went up and fans started whirring on my old (8-core CPU) laptop when running this command.

Listing 10: Installed Packages for kbd-audio

libasound2-dev libblkid-dev libdbus-1-dev libdecor-0-0 libdecor-0-dev libdecor-0-plugin-1-cairo libdrm-dev libegl-dev
libegl1-mesa-dev libffi-dev libgbm-dev libgl-dev libgles-dev libgles1 libglib2.0-dev libglib2.0-dev-bin libglu1-mesa-dev
libglvnd-core-dev libglvnd-dev libglx-dev libibus-1.0-dev libice-dev libmount-dev libopengl-dev libpciaccess-dev
libpcre16-3 libpcre2-16-0 libpcre2-dev libpcre2-posix3 libpcre3-dev libpcre32-3 libpcrecpp0v5 libpthread-stubs0-dev
libpulse-dev libsdl2-2.0-0 libsdl2-dev libselinux1-dev libsepol-dev libsm-dev libsndio-dev libsndio7.0 libudev-dev
libwayland-bin libwayland-dev libx11-dev libxau-dev libxcb1-dev libxcursor-dev libxdmcp-dev libxext-dev libxfixes-dev
libxi-dev libxinerama-dev libxkbcommon-dev libxrandr-dev libxrender-dev libxss-dev libxt-dev libxv-dev libxxf86vm-dev
pkg-config uuid-dev x11proto-dev xorg-sgml-doctools xtrans-dev

Listing 11: Additional Installation Steps

$ git clone https://github.com/ggerganov/kbd-audio
$ cd kbd-audio
$ git submodule update --init
$ mkdir build && cd build
$ apt install cmake -y
$ cmake .. # leave the dots in place

Listing 12: Running make

$ make
[  2%] Building CXX object CMakeFiles/Core.dir/common.cpp.o
[  4%] Building CXX object CMakeFiles/Core.dir/audio-logger.cpp.o
[  6%] Linking CXX static library libCore.a
[  6%] Built target Core
[  8%] Building CXX object CMakeFiles/Gui.dir/common-gui.cpp.o
[ 10%] Building CXX object CMakeFiles/Gui.dir/imgui/imgui.cpp.o
[...]
[100%] Linking CXX executable compress-n-grams
[100%] Built target compress-n-grams

Figure 4: Capturing keyboard audio.

Figure 5: Playing back the recorded audio.

I looked around the container's filesystem and in Jupyter but couldn't find where the analysis of the data was stored. Inokuchi states that reading the external documentation isn't a priority


for using acoustic-keylogger, but it's relatively clear that the process is working as expected.

In the process of using acoustic-keylogger, I discovered that the way keys are depressed and how they spring back affects how the audio is captured. According to Inokuchi's research, the sounds emitted by the keys can be clustered by their position on the keyboard. Figure 3 shows the results for a MacBook Pro 2016.

Smoking Keyboards

Another acoustic keylogger, kbd-audio [10] by Georgi Gerganov, offers a collection of tools for capturing and analyzing acoustic audio.

You can install kbd-audio with ease as follows on Ubuntu Linux 22.04:

$ apt install libsdl2-dev -y

This pulls down the packages shown in Listing 10, thankfully with a small disk footprint of 54.2MB.

During my installation of kbd-audio, I used the commands in Listing 11, which differ slightly from the documentation because I needed additional packages. Listing 11 resulted in lengthy output, which completed successfully, as seen here:

-- Configuring done
-- Generating done
-- Build files have been written to: /root/kbd-audio/build

Finally, I ran the make command to compile the configured build files, as shown in Listing 12. Because I cloned the repository under the root user's home directory, it was important that the compiled commands were executed under the repo's build directory (in my case, /root/kbd-audio/build).

To begin surveilling ambient noise in the room (turn up your microphone to maximum volume for the best results), use:

$ ./record-full output.kbd

Figure 4 shows an excerpt of the recording output.

To play back the keystrokes, run the following command in another terminal, again in the same directory:

$ ./play-full output.kbd

Figure 5 shows what kbd-audio recorded. When I played back the audio from my recording, I could hear my erratic typing noises with external ambient sounds faded out.

Figure 6: After receiving and analyzing the keyboard input, keytap cleverly highlights the relevant keys.

The kbd-audio GitHub repo offers advice on how to get graphical output from its acoustic keylogging activities. There is also an easy-to-use online demo [11] for kbd-audio's keytap tool. Using this demo, I entered a few lines of text and hit the Predict button, and a graphical representation appeared for some of the typed characters, as shown in Figure 6. The output in Figure 7 shows how keytap learns from the sounds it receives. Finally, a YouTube video [12] on keytap provides additional information.

As mentioned earlier, the sound of a key being depressed and springing back is what gets analyzed. Figure 8 shows kbd-audio's representation of what that looks like in a sound file.

Figure 7: The learning process occurring under the hood for keytap.

Figure 8: The ups and downs of keys when typing (source: https://ggerganov.github.io/jekyll/update/2018/11/30/keytap-description-and-thoughts.html).

Two's a Crowd

You'll find two other evolutions of keytap in the kbd-audio repo. The second evolution, keytap2, does not require training data. (I'm sure you can see the


significant benefits of this iteration of the tool.) Instead of using training data, keytap2 references statistical information in relation to the n-gram frequencies involved. An n-gram is a series of adjacent letters [13]. For a treatise on how keytap2 works, see [14].

You can test out keytap2 in Gerganov's Capture The Flag (CTF) competition [15], where successful users enter a Hall of Fame. A keytap2 online demo [16] offers helpful instructions to get you up and running after clicking the Init button.

Three and Magic Numbers

The final version in the kbd-audio repo is keytap3, which improves on the algorithm and provides better n-gram statistics. In addition, keytap3 no longer requires manual intervention during text recovery – it is fully automated.

Author

Chris Binnie is a Cloud Native Security consultant and co-author of the book Cloud Native Security: https://www.amazon.com/Cloud-Native-Security-Chris-Binnie/dp/1119782236.

To see how keytap3 works, you can watch a 90-second YouTube video [17]. If you're not concerned about acoustic keylogging after watching this video, then you are clearly less concerned with cybersecurity than I am.

You can also try out keytap3 using an online GUI [18]. To get started with the demo, press the Init button and then provide your browser with the correct permissions when prompted.

Finally, an online test [19] lets you check your keyboard's security. You type 100 characters and then press Init to get your results (Figure 9). You can also play back your recording over your speakers if desired. In testing my keyboard, I found the results worrying but not fully accurate. I suspect using old hardware is a blessing in this case.

Figure 9: The results of a keyboard vulnerability test.

Conclusions

I have demonstrated a number of keylogging tools, ranging from those that capture key presses to those that record typing audio. Even in their current iterations, these tools should give you pause. For some tips on protecting yourself from keyloggers, I recommend checking out this cursory discussion on the topic [20].

As AI advances over the next few years, keylogging tools will likely evolve. Until then, you might consider how many devices in your home have a microphone and perhaps reduce them in number. You might also want to sign out of your online accounts during video calls.
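To make the n-gram idea from the keytap2 discussion concrete, here is a toy awk sketch of my own (not code from the kbd-audio repo) that builds a letter bigram (2-gram) frequency table, the kind of statistic keytap2 consults when matching sounds to likely text:

```shell
# Count letter bigrams in a string and print the three most frequent;
# a toy version of the n-gram statistics a tool like keytap2 relies on.
printf 'the theory of the thing' |
  awk '{ s = tolower($0); gsub(/[^a-z]/, "", s)        # keep letters only
         for (i = 1; i < length(s); i++) c[substr(s, i, 2)]++ }
       END { for (k in c) print c[k], k }' |
  sort -rn | head -n 3
# prints:
# 4 th
# 3 he
# 2 et
```

Run over a large English corpus instead of one sentence, the same table is what lets a recovery tool prefer "th" over "tq" when two keystroke sounds are acoustically ambiguous.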

Info

[1] "New acoustic attack steals data from keystrokes with 95% accuracy" by Bill Toulas, Bleeping Computer, August 5, 2023: https://www.bleepingcomputer.com/news/security/new-acoustic-attack-steals-data-from-keystrokes-with-95-percent-accuracy
[2] "AI-Powered 'BlackMamba' Keylogging Attack Evades Modern EDR Security" by Elizabeth Montalbano, Dark Reading, March 8, 2023: https://www.darkreading.com/endpoint/ai-blackmamba-keylogging-edr-security
[3] logkeys: https://github.com/kernc/logkeys
[4] logkeys documentation: https://github.com/kernc/logkeys/blob/master/docs/Documentation.md
[5] lkl: https://sourceforge.net/projects/lkl
[6] uberkey: https://linux.die.net/man/8/uberkey
[7] Side-channel attack: https://en.wikipedia.org/wiki/Side-channel_attack
[8] acoustic-keylogger: https://github.com/shoyo/acoustic-keylogger
[9] Jupyter: https://jupyter.org
[10] kbd-audio: https://github.com/ggerganov/kbd-audio
[11] kbd-audio demo: https://keytap.ggerganov.com
[12] keytap demo: https://www.youtube.com/watch?v=2OjzI9m7W10
[13] n-gram: https://en.wikipedia.org/wiki/N-gram
[14] n-gram frequencies: https://github.com/ggerganov/kbd-audio/discussions/31
[15] CTF challenge: https://ggerganov.github.io/keytap-challenge
[16] keytap2 demo: https://keytap2.ggerganov.com
[17] keytap3 demo: https://youtu.be/5aphvxpSt3o
[18] keytap3 GUI: https://keytap3-gui.ggerganov.com
[19] keytap3 test: https://keytap3.ggerganov.com
[20] Prevention tips: https://security.stackexchange.com/questions/119730/targeted-acoustic-keylogging-attack-prevention



IN-DEPTH
Command Line – neofetch

A command-line
system information tool

System in a
Nutshell
Neofetch displays system information about your hardware, operating system, and desktop
settings in visually appealing output perfect for system screenshots. By Bruce Byfield

Linux has never lacked applications that display system information, but perhaps the most comprehensive tool is neofetch [1], a Bash script that displays the current information about hardware, operating systems, and desktop settings. The information is presented by default in a somewhat haphazard order, which can be compensated for by a high degree of customization.

Little wonder, then, that in recent years neofetch has found its way into most distributions. Not only is it a useful summary of system information, supporting a wide array of hardware and software, but, as its GitHub page notes, its visually appealing output is also useful in screenshots of your system.

For many, the output of the bare command may be enough (Figure 1). On the left of Figure 1 is an ASCII rendition of the installed distribution's logo. On the right are 15 system statistics. Which statistics are shown, the details of each statistic, and the general layout are all customizable, either from the command line or from .config/neofetch/config.conf in the user's home directory (Figure 2). At the bottom, a line of colored blocks does nothing except mark the end of the display.

Photo by Mockup Graphics on Unsplash

Figure 1: Neofetch's default output: In addition to a wide range of system information, it includes an ASCII rendering of the distribution logo.
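The config.conf file just mentioned is itself a Bash script: which statistics appear, and in what order, is controlled by a print_info function made of info calls. Here is a trimmed-down sketch based on the stock file's layout (the exact entries below are my illustration, not the shipped defaults):

```shell
# Hypothetical excerpt of ~/.config/neofetch/config.conf.
# The stock file defines print_info() as a list of `info` calls;
# delete, reorder, or comment out lines to change what neofetch displays.
print_info() {
    info title
    info "OS" distro
    info "Kernel" kernel
    info "Uptime" uptime
    info "Memory" memory
}
```

Because the file is plain Bash, commenting out a line with # is enough to hide that statistic.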




Figure 2: Neofetch creates a configuration file for each user.

Display Options

Neofetch has dozens of options, most of which are self-explanatory. They cover a bewildering array of statistics, covering every aspect of a system (Table 1). After each option, you can specify whether its display is off or on. Alternatively, you can use --disable OPTION to turn options off in a space-separated list. In addition, some options have multiple settings. Some stats display on separate lines, while others simply add a few characters to a default line.

Note that neofetch is not a monitor that constantly updates the information it gives, like top does. It displays only the current information when it is run. Most of this information is available in your desktop settings or through other commands like uname, but neofetch provides a convenient summary. Most users will probably not care to scroll through all the options, opting instead for a careful selection of the statistics that are most useful to them. Neofetch can also be called on in scripts, using the bare command plus one or two options. The man page gives this example:

memory="$(neofetch memory)"; memory="${memory##*: }"

or

IFS=$'\n' read -d "" -ra info < <(neofetch memory uptime wm)
info=("${info[@]##*: }")

The Configuration File

Neofetch creates .config/neofetch/config.conf in a user's home directory the first time it is used. Statistics can also be cut and pasted to rearrange them. The config file gives examples, but online help is available if needed [2], including a neofetch Reddit [3]. A configuration setting can be overridden by a command-line option.
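The "${memory##*: }" expansion in the man page example above is plain Bash parameter expansion: it strips everything up to and including the last ": " from the captured line, leaving only the value. You can see the effect without neofetch installed; the sample line below is invented for illustration:

```shell
# Mimic trimming a neofetch info line down to its value; the line
# itself is a made-up stand-in for the output of `neofetch memory`.
line="Memory: 3942MiB / 15897MiB"
printf '%s\n' "${line##*: }"
# prints: 3942MiB / 15897MiB
```

The same ##-style trimming is applied to every array element at once in the second man page example via "${info[@]##*: }".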
Table 1: Selected neofetch Options (all options take on/off values)

Hardware & OS
  --title_fqdn            Full domain name in title
  --os_arch               System architecture
  --package_managers      Includes universal packages
  --speed_type type       CPU speed (current, min, max, bios, scaling_current, scaling_min, scaling_max)
  --cpu_brand             CPU manufacturer
  --cpu_cores             CPU core type (logical, physical)
  --cpu_speed             CPU speed
  --cpu_temp              CPU temperature (C for Celsius, F for Fahrenheit)
  --refresh_rate          Displays refresh rate on each monitor
  --gpu_brand             GPU brand (AMD, NVIDIA, Intel)
  --disk_show             Filesystems to show (/dev or /path)
  --disk_percent          Memory used on disk
  --cpu_display MODE      Bar mode (bar, infobar, barinfo)
  --memory_display MODE   Bar mode (bar, infobar, barinfo)
  --battery_display MODE  Bar mode (bar, infobar, barinfo)

Desktop Environment
  --de_version            Show desktop environment
  --gtk2                  Enable/disable GTK2 (theme, font, icons)
  --shell_version         Show shell version

Text Format
  --colors COLORS         Comma-separated list of colors; changes color in this order: title, @, underline, subtitle, colon, info
  --color_block           Toggle color blocks
  --ascii_distro DISTRO   ASCII image for distribution
  --source SOURCE         Source for logo image

Is neofetch an Orphan?

Some users are concerned that neofetch has had no updates for almost two years. The reason may be that there is nothing new to add. Consequently, despite the fact that neofetch still works on most systems, many are looking for an alternative. Many coding languages have their own version of neofetch, including Java, Pascal, C++, Perl, Rust, Lua, and Python, but the newest and most popular is fastfetch [4]. Written in C, fastfetch is a close clone and faster than neofetch, but remains a work in progress. Fastfetch lacks a man page, and some of neofetch's options are currently unsupported for some distributions, so it is only starting to be included in distributions' repositories. Instead, users must compile fastfetch separately or hunt for a suitable package. For now, most users should probably stick to neofetch if possible.

Info

[1] neofetch: https://github.com/dylanaraps/neofetch
[2] Config file: https://github.com/dylanaraps/neofetch/wiki/Config-File
[3] neofetch Reddit: https://www.reddit.com/r/sysfetch/
[4] fastfetch packages: https://github.com/fastfetch-cli/fastfetch/releases



IN-DEPTH
datamash

Data processor

Open Source Gem


A little-known, very powerful data processor for your scripts, datamash makes long, complex
calculations simple. By Marco Fioretti

Lead Image © Mikhail Avlasenko, 123RF.com

GNU datamash [1] is a command-line program capable of analyzing, summarizing, or transforming in various ways tables of numbers, with or without text, stored inside plaintext files. For these kinds of tasks, datamash is often a faster, more productive alternative to tools like AWK, sed, or any scripting language.

Just like those other tools, datamash is a good team player, in the traditional Unix and Linux sense: You can use datamash interactively at the prompt, automatically in shell scripts, and even directly attach it to other programs (including itself!) via Unix pipes. Besides, in almost all the cases I have seen or can imagine, datamash does what you need with less typing, possibly a lot less. Last but not least, datamash lets you easily perform basic quality checks on raw data. I'll show you how to do all this from scratch, starting with the basic options and ways of working with datamash and then moving to more complicated examples.

Practice with Sample Files

Datamash does not offer many sample files for learning and testing its many features. At the time of writing, the datamash package only includes four sample files. On Linux, depending on your distribution, you can find them in /usr/share, /usr/local/share, or /usr/share/doc/. If these aren't enough, you can generate as many sample files as desired with simple scripts such as the one in Listing 1, which is a snippet of code from another project that I quickly adapted for this article.

Listing 1: Generating datamash Test Files

01 #! /usr/bin/perl
02
03 use strict;
04
05 my $LINES = 500;
06 my $COLS = 4;
07 my $CNT = $COLS*$LINES;
08
09 my $I = 0;
10
11 while ($I < $CNT) {
12
13   my $NL = $I % $COLS;
14   printf "%4.4s%s", int rand(1001), (($COLS -1) == $NL) ? "\n" : "\t"; $I++;
15 }

Listing 1 creates a table of random integers, with the number of lines and columns defined in lines 5 and 6. As is, Listing 1 will generate 2,000 random integers between 0 and 1,000 and print them separated by tabs (line 14).

More precisely, the counter $I initialized in line 9 is incremented each time a number is added, at the very end of line 14. Each time, the remainder of dividing the counter's current value by the desired number of columns is also assigned to the $NL variable in line 13. With four columns, this means that $NL will cycle through the values (0,1,2,3) until the program ends, making the comparison in line 14 (($COLS -1) == $NL) true only once every four iterations of the loop.
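The row-breaking arithmetic from lines 13 and 14 can be tried in isolation; this awk one-liner of mine applies the same modulo test to eight sequential values, emitting a newline after every fourth one and a tab otherwise:

```shell
# The same ($I % $COLS) logic as lines 13-14 of Listing 1:
# a newline after every 4th value, a tab between the others.
awk 'BEGIN { COLS = 4
             for (i = 0; i < 8; i++)
                 printf "%d%s", i, ((i % COLS == COLS - 1) ? "\n" : "\t") }'
# prints two tab-separated rows: 0..3 and 4..7
```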




When that happens, the code will print a new line instead of a tab (i.e., start a new row of the table instead of adding another column). You may modify the code in Listing 1 as desired, including generating text instead of numbers, by adding arrays of strings and then using the counter, or another random number, as an index to load elements of those arrays.

The datamash Way

Datamash processes data organized in columns and rows (i.e., lines of text) by calling functions that perform "operations" on every element (field) of the column or columns they are told to use.

The datamash documentation divides the available operations into six categories. The simplest one, called "line-filtering," consists of the single operation called rmdup that, as its name suggests, removes duplicate lines.

Most of datamash's data processing functions are classified in other categories that can work "per-line" or by "grouping." I initially found that choice of terms (even though I honestly cannot suggest better alternatives) a bit confusing. In datamash, per-line operations are those that, for every row of data, output one new value for every field of that line whose column was selected when datamash was launched. Per-line operations can do both string and number processing.

You may, for example, call functions such as dirname, basename, and barename to get the corresponding parts of a file path, or getnum to extract components such as 753.4 from strings such as somenum753.4. If the data is numeric, you may among other things ask datamash to calculate several types of checksum, encode or decode numbers in base 64, or round them.

All the "grouping" operations instead return just one value for every column they are told to process, or for each part (more on this soon) of the same column. For example, a command like

#> datamash max 3 min 1 mean 5 < somefile.csv
4300 23 304,3

would make datamash print the maximum (4300), minimum (23), and mean (304,3) values of the third, first, and fifth columns of the file somefile.csv. The numerical and statistical grouping operations include both self-explaining functions such as sum, min, max, or mean, and many obscure (to me) statistical ones. There are also operations such as countunique that count the number of unique values in a column. To learn about all the possible grouping operations, please consult the man page or the online documentation on the website.

Finally, datamash has a "primary" category of five very important meta-operations, which must be listed first when used. Of these, the one you will likely use most often is called groupby (-g for short). I will explain it in a moment, leaving the others for last, after introducing some other basic concepts and command-line options of datamash.

Data Parsing

Datamash can manage both contiguous ranges of columns, which are declared by joining their extremes with a dash, and any random combination of columns passed as a comma-separated list:

#> cat sample-file.csv | datamash max 7,2,5
#> cat sample-file.csv | datamash max 3-7

The first command prints the maximum values of columns 7, 2, and 5 in that order, while the second returns the five maximums of columns 3 to 7. Please notice that if you want an operation done on all the columns of a file, you must explicitly declare the whole range. If a file has 23 columns, for example, and you need to know the maximum value in each of them, you should enter:

#> datamash max 1-23 < file-with-23-columns.csv

Some operations have additional syntax because they either require a parameter or must combine different columns to produce one result:

#> datamash perc:40 5 < input-file.csv
#> datamash pcov 4:6 < input-file.csv

Here, datamash's first call returns the 40th percentile of column 5, and the second returns the covariance (i.e., joint variability) of the values in columns 4 and 6.

How are columns recognized? By default, datamash assumes they are separated by single tabs. If they are delimited by other white spaces, or combinations of them, you must say so with the --whitespace or -W options. In that case, leading white spaces are ignored. Any other column delimiter, for example, a slash, must be declared with -t / or --field-separator=/.

Another thing you need to know about datamash is shown by this short file of space-separated floating-point numbers:

#> cat floating.csv
34.2 35.3
14.9 -3.3
#> datamash -W sum 1 min 2 < floating.csv
datamash: invalid numeric value in line 1 field 1: '34.2'

This request to calculate the sum of column 1 and the minimum value in column 2 failed because, in my locale, datamash wants commas, not dots, as decimal-point characters. To make datamash happy, use the tr command to translate all dots to commas:

#> cat floating.csv | tr '.' ',' | datamash -W sum 1 min 2
49,1 -3,3

The -C or --skip-comments option makes datamash ignore lines that start with hashes or semicolons. Comments in other formats (e.g., lines starting with two slashes, as in the C language) may be hidden from datamash by prefixing them with a hash with the sed command:

#> cat file-with-c-style-comments.csv | sed -e 's|^//|# //|' | datamash -C ...

Header lines with column labels greatly increase the readability of both the input files and datamash's output. If the first line of a file contains labels for its columns, as in this sample file from the datamash documentation

#> cat /usr/share/doc/datamash/examples/scores_h.txt | head -3
Name Major Score
Shawn Arts 65
Marques Arts 58




then datamash will recognize and accept input data – a tab or whatever was de- or rearrange columns of data before
those labels as column names, if given clared with the -t or -W switches. If you performing any of the operations de-
the --header-in option. You may issue want a different column delimiter, how- scribed so far. Consider the file in List-
commands such as ever, you can set it with: ing 2, which lists the number of users
of several operating systems in differ-
#> cat scores_h.txt | datamash --headerU #> echo "2,4 3,7 112,88" | datamash -W U ent places.
-in min Score ceil 1-3 '--output-delimiter=|' With datamash, you can find the total
14 3|4|113 number of users of each operating sys-
tem as follows:
to find that the minimum score in the To add headers, use --header-out. Cou-
whole file is 14. There is a trap here, pled with --header-in, that option will #> cat users.tsv | datamash -s -g 2 sum 1

however. Consider this case, where the use the same headers present in the freebsd 2981

question asked to datamash using the input file. Otherwise it will print the op- linux 222743

--header-in option seems to be “what is erations corresponding to each column: unix 29437

the sum of the numbers in the column


with label 1 (the middle column)?”: echo "2,4 3,7 112,88" | datamash -W U With respect to the previous example,
floor 3 ceil 2 --header-outfloorU what’s really new here is the primary op-
#> cat bad-headers.csv (field-3) ceil(field-2) eration called -g or --groupby.
0 1 2 112 4 As its name implies, this operation
1,1 2,2 3,3
7,1 5 4,9

#> cat bad-headers.csv | datamash -W --header-in sum 1
8,2

In spite of being told to use the values in the first line as column labels, datamash summed the numbers in the first column (1,1 and 7,1), instead of those (2,2 and 5) in the middle column that has the label 1 inside the data file. The reason is that the --header-in option does not override the numeric indices, which have a higher priority! The obvious solution, because it also is a good practice in general, is to not label columns with numeric indices.

Output Formatting
On the output side, datamash can format its results in several ways, which are almost all mirror versions of the input parsing options I just described. The first exception is the -f or --full command switch, which prints the full line of input data right before the result of any other operation you asked datamash to perform. If you use --format=FORMAT instead (see the man page for details), you can print the outputs in any way supported by the printf system function.
By default, the output delimiter for columns will be the same as the input delimiter. As far as output formatting is concerned, you also need to know about a limitation of the datamash setting for decimal precision:

#> cat rounding.csv
1,89
2,437
0,925
#> datamash -R 5 mean 1 < rounding.csv
1,75067

The example above shows that you can limit the number of decimal digits in the output with the -R (rounding) switch. However, you cannot eliminate them completely: Had I set -R to 0 to mean no decimals, datamash would have complained that 0 is not a valid value. Luckily, this limitation is also easy to fix, as I will show in another example.

Let's Group!
The real power of datamash becomes evident whenever you need to combine grouping and computation: The groupby operation makes datamash partition all the rows of data that have the same value in the column passed to groupby in as many separate groups, in order to perform the desired operation on each of those groups, one at a time, and then assemble all the results.

Listing 2: Number of Users
#> cat users.tsv
1993 linux
2981 freebsd
30940 linux
389 linux
29000 unix
189421 linux
437 unix

In my example, -g 2 right before sum 1 tells datamash to group the rows by using the values in the second column as keys and then to calculate for each of those groups the sum of all its elements in the first column. In order for this to work as expected, however, the first thing to do is to sort all the rows on the same column, which is what the -s does.
The groupby operation is even more powerful than it may look from the first example because it can work on multiple columns. Take the following example: In addition to the "Eternal

Listing 3: Number of Events by Place
#> cat cities.csv
Rome Alabama 1987
Rome Georgia 2015
Rome Illinois 1998
Rome Alabama 2002
Rome Iowa 2020
Rome Alabama 2007
Rome Illinois 1974

#> cat cities.csv | datamash -t' ' -s -g 1,2 count 1 min 3
Rome Alabama 3 1987
Rome Georgia 1 2015
Rome Illinois 2 1974
Rome Iowa 1 2020
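The grouping shown in Listing 3 is easy to model in plain Python. The sketch below is a toy re-implementation of the -s -g 2 sum 1 semantics on the users.tsv data, not datamash's actual code:

```python
from itertools import groupby

# the users.tsv rows from Listing 2, as (value, key) pairs
rows = [(1993, "linux"), (2981, "freebsd"), (30940, "linux"),
        (389, "linux"), (29000, "unix"), (189421, "linux"), (437, "unix")]

# -s: sort on the grouping column first; like datamash,
# itertools.groupby only merges *adjacent* rows with equal keys
rows.sort(key=lambda r: r[1])

# -g 2 sum 1: group on the second column, sum the first per group
sums = {key: sum(value for value, _ in group)
        for key, group in groupby(rows, key=lambda r: r[1])}
print(sums)  # {'freebsd': 2981, 'linux': 222743, 'unix': 29437}
```

Skipping the sort step mirrors forgetting -s: rows with the same key that are not adjacent would land in separate groups.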

50 JANUARY 2024 ISSUE 278 LINUX-MAGAZINE.COM


IN-DEPTH
datamash

City" in Italy, there are more than a dozen places named Rome just in the United States. Imagine that someone recorded every time a certain event, be it the birth of quadruplets or a visit from the US president, took place in those US locations. I can use groupby to ask datamash to tell me how many of these events have happened in each of those places, including when the first one happened as shown in Listing 3.
Listing 3 gives the desired answer thanks to the only substantial difference between this invocation of datamash and the previous one: This time, I told datamash to group and process as one key the combination of two columns (-g 1,2). That's why it could calculate that in Rome Alabama the event happened three times, starting in 1987.
Another thing that is important to learn from the last two examples is that the groupby operation always prints first the column, or combination of columns, that it used as keys. What if you needed to have those columns in some other position? The answer, as I will show shortly, is to pass the output of datamash to some other tool, such as AWK, sed, or even a second invocation of datamash!

Mixed Text/Data Processing
By now, you have already seen that, while the main focus of datamash is numbers and numeric operations, it can also process textual values. You can see more of its capabilities for handling textual values by looking for the "scores" and "passwords" examples in the online manual [1]. Here I present a slightly more complicated example of the same capabilities, based on a personal need.
Among other things, I manage three blogs whose posts are archived in my computer as plaintext files with Markdown syntax, inside three folders named stop, freesw, and tips, which are shortcuts for the real names of the blogs.
For several reasons, I need to regularly check some statistics about those blogs, including the minimum, maximum, and average number of words of their posts. I do that by preprocessing and then passing to datamash the outputs of the Unix command wc that, when given a file, prints out just its number of lines, words, and characters, in the following order:

#> wc testfile.md
33 407 3608 testfile.md

Listing 4: Print Summary Statistics of Three Blogs
ONE #> find . -type f -name "*.md" | xargs wc
32 284 2359 ./stop/google-is-microsoft-2.0.md
68 532 3579 ./stop/spying-is-over.md
253 4151 27074 ./freesw/nextcloud-16-review.md
48 411 3184 ./stop/ready-facebook-one.md
...

TWO #> find . -type f -name "*.md" | xargs wc \
  | sort -t / -k 2
253 4151 27074 ./freesw/nextcloud-16-review.md
32 284 2359 ./stop/google-is-microsoft-2.0.md
68 532 3579 ./stop/spying-is-over.md
48 411 3184 ./stop/ready-facebook-one.md
...

THREE #> find . -type f -name "*.md" | xargs wc \
  | sort -t / -k 2 \
  | tr / " "
202 1245 8267 . freesw ignore-threads-in-mailing-lists.md
196 1978 14890 . freesw odf-slideshows-from-plain-text-files.md
39 401 2793 . stop nuclear-batteries-yay.md
47 537 3616 . stop obstacles-to-open-data.md
24 159 1571 . tips records-usa-can-be-proud-of.md
45 449 3548 . tips teacher-adept-at-firearms.md
...

FOUR #> find . -type f -name "*.md" | xargs wc \
  | sort -t / -k 2 \
  | tr / " " \
  | datamash -W groupby 5 mean 1 mean 2 mean 3 min 2 max 2
freesw 95,1 937,9 6578,0 26,0 4151,0
stop 58,6 610,6 4401,6 33,0 5657,0
tips 41,7 285,6 2327,2 58,0 4262,0
...

FIVE #> find . -type f -name "*.md" | xargs wc \
  | sort -t / -k 2 \
  | tr / " " \
  | datamash -W groupby 5 mean 1 mean 2 mean 3 min 2 max 2 \
  | datamash basename 1 trunc 2-6
freesw 95 937 6578 26 4151
stop 58 610 4401 33 5657
tips 41 285 2327 58 4262

Explanation of Listing 4
Listing 4 shows the several steps I took to compose the datamash-based command that would do just what I needed. To understand it, please note


that I prefixed the shell prompts with numbers in capital letters to make the explanation easier to follow. I also cut the output of each command to just a few hand-picked lines, for readability and brevity.
ONE: This finds all the Markdown files in the root directory of my blogs and, through the xargs command, passes them all to wc. The output has all the data I need, but it is not sorted by blog name (the freesw entry should be first, not third!). This is the way the find command and Linux filesystems work, but datamash can only group rows presorted by the grouping key. As far as I understand, the sorting that would be needed here is beyond datamash's capabilities – no problem though.
TWO: I piped the output of the initial command to the sort utility, telling it to sort on the second field (-k 2), with / as field separator. This sorted the posts by blog, as needed, so on to the next problem.
THREE: The find command prints the whole path to a file, but the only part I need datamash to see is the blog name (i.e., freesw, stop, or tips). This is a problem because that part is delimited by slashes, not spaces like the previous columns. Because datamash does not support multiple field delimiters, I converted the slashes to spaces with the tr command. Now all the columns have the same delimiter, and the blog names are always in the fifth column. This is something datamash can handle!
FOUR: I can finally add datamash to the pipe, first setting the column separator to whitespaces (-W), and then asking to group on column 5 (the blog name), in order to first print the mean values of line, words, and character numbers of all the posts of each blog, followed by their minimum and maximum number of words. At this point, the only thing left is to get rid of the decimal digits.
What I actually got (even if I left only the first digits in Listing 4) were numbers like 937,91039, which are just confusing. For my purposes, truncating all those numbers to integers would be more than adequate. The problem is, how can I do it if, as explained above, I cannot give the -R option a null value?
FIVE: Here is the solution: Pipe the output of datamash to datamash, telling it to truncate all the numeric fields, which are those in the columns with indexes between 2 and 6!

Quality Control
Remember I said that datamash has not just groupby, but a whole category of "primary operations"? Time to talk about the other four, which add to datamash two different capabilities that I like a lot, the first being a sort of quality control.
The check operation generates an error message if the rows of the current file do not have exactly the same number of arguments (Listing 5).

Listing 5: check Operation Error Message
$ cat bad.csv
A 1 ww
B 2 xx
C 3
D 4 zz
$ datamash check < bad.csv
datamash: check failed: line 3 has 2 fields (previous line had 3)
fail

Inside a shell script, you may automate the check and generate more synthetic error messages as follows

datamash check < bad.csv || die "this file has an invalid structure"

because (without going into details) the command after the || operator will only be executed if the datamash check fails. The control can be even more precise, because check accepts two optional arguments (lines and columns) and will fail unless the target file has exactly that number of lines and columns.

Transformations
The last major type of operation that datamash can perform is what I would call the data or table "transformations" provided by the primary functions called reverse, transpose, and crosstab.
The first one reverses, unsurprisingly, the positions of all columns (Listing 6). Combined with the cut command, which extracts whatever combination of columns you want, datamash's reverse operation makes it very easy to rearrange columns in a text file any way you desire.
Compared to reverse, transpose somehow does a mirror operation, because it swaps rows with columns (Listing 7).
It is possible to reverse or transpose files even if their lines do not have all the same number of columns by adding the --no-strict option. In those cases, you may even fill the empty fields with a string of your choice using --filler="FILLER STRING HERE".
The crosstab operation, which exposes the relationships between two columns,

Listing 6: Reverse Column Positions
#> cat eight-columns-file.tsv
768 907 240 539 644 890 380 344
901 646 534 653 18 653 14 547
257 808 802 650 139 450 19 113

#> cat eight-columns-file.tsv | datamash -W reverse
344 380 890 644 539 240 907 768
547 14 653 18 653 534 646 901
113 19 450 139 650 802 808 257

Listing 7: Transpose Rows with Columns
#> cat 3-columns-file.csv
a 1 OK
b 5 OK
c -1 OK
d -20 NOK

#> cat 3-columns-file.csv | datamash -W transpose
a b c d
1 5 -1 -20
OK OK OK NOK
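For illustration, the check, reverse, and transpose semantics just described are easy to model in a few lines of plain Python. This is a sketch of what the operations do, not of how datamash implements them:

```python
# the 3-columns-file.csv rows from Listing 7
rows = [["a", "1", "OK"], ["b", "5", "OK"],
        ["c", "-1", "OK"], ["d", "-20", "NOK"]]

# check: complain unless every row has the same number of fields
field_counts = {len(r) for r in rows}
assert len(field_counts) == 1, "file has an invalid structure"

# reverse: flip the column order within every row
reversed_rows = [r[::-1] for r in rows]
print(reversed_rows[0])  # ['OK', '1', 'a']

# transpose: swap rows with columns, as in Listing 7
transposed = [list(col) for col in zip(*rows)]
print(transposed[0])     # ['a', 'b', 'c', 'd']
```

As with datamash's strict mode, zip() would silently truncate ragged rows; handling those would need the equivalent of --no-strict and --filler.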


is the datamash version of pivot tables. At first sight, crosstab may seem to be just another way to group multiple columns, because it can count how many rows have the same values in a given pair of columns, as shown in Listing 8.
In Listing 8, datamash indeed tells that a and x appear side-by-side two times in the input file. If this were the whole story, crosstab would be just another version of grouping that displays its findings with a matrix instead of a list.
The added value of crosstab is that it can show using the same format the result of many other grouping operations, not just the number of times each pair appears. This is evident in these two examples from the datamash manual (Listing 9), where crosstab is used to show first the sums and then the unique values from the third column, for any combination of values from the first two.

Listing 8: crosstab Example
$ cat input.txt
a x 3
a y 7
b x 21
a x 40

$ datamash -s crosstab 1,2 < input.txt
  x y
a 2 1
b 1 N/A

Listing 9: crosstab Shows Sums and Values
#> datamash -s crosstab 1,2 sum 3 < input.txt
  x y
a 43 7
b 21 N/A

#> datamash -s crosstab 1,2 unique 3 < input.txt
  x y
a 3,40 7
b 21 N/A

Conclusion
Datamash is one of those little-known open source gems that may be a huge time saver for more than a few users. If you have tabular data of any type, try it, alone or as a lightweight but still powerful sidekick of VisiData [2]. You won't regret it! QQQ

Info
[1] GNU datamash: www.gnu.org/software/datamash/
[2] "A Command-Line Data Visualization Tool" by Marco Fioretti, Linux Magazine, issue 277, December 2023, pp. 40-45

Author
Marco Fioretti (https://mfioretti.substack.com) is a freelance author, trainer, and researcher based in Rome, Italy, who has been working with free/open source software since 1995, and on open digital standards since 2005. Marco also is a board member of the Free Knowledge Institute (http://freeknowledge.eu).
IN-DEPTH
PyScript

Using Python in the browser

Snake Charmer
PyScript lets you use your favorite Python libraries on client-side web pages. By Pete Metcalfe

While there are some great Python web server frameworks such as Flask, Django, and Bottle, using Python on the server side adds complexity for web developers. To use Python on the web, you also need to support JavaScript on client-side web pages. To address this problem, some Python-to-JavaScript translators, such as JavaScripthon, Js2Py, and Transcrypt, have been developed.
The Brython (which stands for Browser Python) project [1] took the first big step in offering Python as an alternative to JavaScript by offering a Python interpreter written in JavaScript. Brython is a great solution for Python enthusiasts, because it's fast and easy to use. However, it only supports a very limited selection of Python libraries.
PyScript [2] offers a new, innovative solution to the Python-on-a-web-page problem by allowing access to many of the Python Package Index (PyPI) repository libraries. The concept behind PyScript is a little different. It uses Pyodide, which is a Python interpreter for the WebAssembly (Wasm) virtual machine. This approach offers Python within a virtual environment on the web client.
In this article, I will introduce PyScript with some typical high school or university engineering examples. I will also summarize some of the strengths and weaknesses that I've found while working with PyScript.

Getting Started
PyScript doesn't require any special software on either the server or client; all

Photo by Godwin Angeline Benjo on Unsplash
Figure 1: The main components of a PyScript web page.
ment on the web client. Figure 1: The main components of a PyScript web page.


the coding is done directly on the web page. For PyScript to run, it needs three things (Figure 1):
• a definition in the header for the PyScript CSS and JS links,
• a <py-config> section to define the Python packages to load, and
• a <py-script> section for the Python code.
In Figure 1, the <py-script> section uses terminal=true (the default) to enable Python print() statements to go directly to the web page. A little bit later, I'll show you how to put PyScript data into HTML tags.
Figure 2 shows the running web page. This math example performs the Python SymPy simplify function on a complex equation to reduce the equation to its simplest form. The pprint() (pretty print) function outputs the equation into a more presentable format on the page's py-terminal element (the black background section shown in Figure 2).

Figure 2: PyScript using the Python SymPy library.

Debugging code is always an issue. The web browser will highlight some general errors in the PyScript pages. To see more detailed Python errors, right-click on the page and select the Inspect option and then click on the Console heading. Figure 3 shows a very typical error: a print() function missing a closing quote character.

Figure 3: Debug PyScript with the browser's Inspect option.

Calling PyScript Functions
In the previous example, PyScript was called just once, which is similar to how a JavaScript <script> block is executed when it is embedded within a web page's <body> section.
There are several ways to call a PyScript function. You can use the traditional JavaScript approach of adding a function reference within a tag reference as shown in the following button example:

<button py-click="my_pyfunc()" id="button1">Call Pyscript</button>

PyScript supports a wide range of actions. For the button, a click event is defined with the py-click option, but other actions such as a double-click (py-dblclick) or a mouseover (py-mouseover) event could also be added.
Listing 1 shows a button click action that calls a function, current_time(), to print the present time into a PyScript terminal section (Figure 4).
A more Pythonic approach to calling a PyScript function is available with the @when API. The syntax for this is:

<py-script>
from pyscript import when
# define id and action,
# then next line is the function
@when("click", selector="#button1")
def my_pyfunc():
    print("Button 1 pressed")
</py-script>

You can also use the @when function to refresh an HTML tag, which I cover in the next section.

A Calendar Example
Now I'll provide a calendar example (Listing 2) that uses a button and PyScript to replace the contents of an HTML tag. To keep things simple, the Python calendar output will be left as ASCII and an HTML <pre> tag will be used (Figure 5).
The calendar page has Back and Forward buttons (lines 11-12) and a <pre> section (line 13).
In the <py-script> section, the when and calendar libraries are imported on lines 16-17. These two libraries are part of the base PyScript/Python that is loaded into Pyodide, so a <py-config> section is not needed.
Like calling PyScript functions, there are multiple ways to read and write web content. PyScript has a built-in display() function that is used to write to HTML


tags (lines 20, 26, and 32). The syntax for the display() function is:

display(*values, target="tag-id", append=True)

The *values can be a Python variable or an object like a Matplotlib figure.
The @when function (lines 22 and 28) connects the Back and Forward button clicks to the functions back_year() and forward_year().

PyScript with JavaScript Libraries
In many cases you'll want to use JavaScript libraries along with PyScript. For example, you might want to include JavaScript prompts or alert messages for your page. To access a JavaScript library, add the line:

from js import some_library

Listing 3 shows the code to import the alert and prompt libraries, then prompts the user for their name, and finally displays an alert message with the entered name (Figure 6).

Reading and Plotting a Local CSV File
For a final, more challenging example, I'll use PyScript to read a local CSV file into a pandas dataframe and then use Matplotlib to plot a bar chart (Figure 7).
For security reasons, web browsers cannot access local files without the user's authorization. To allow PyScript to access a local file, you need to do three key things. To start, you need to configure a page with an <input type="file"> tag. To call a file-picker dialog with a CSV filter, enter:

Listing 1: Button Click Action
<!DOCTYPE html>
<html lang="en">
<head>
<title>Current Time</title>
<link rel="stylesheet" href="https://pyscript.net/latest/pyscript.css" />
<script defer src="https://pyscript.net/latest/pyscript.js"></script>
</head>

<body>
<h1>Py-click to call a Pyscript Function</h1>
<!-- add py-click into the button tag -->
<button py-click="current_time()" id="get-time" class="py-button">Get current time</button>

<py-script>
import datetime
# this function is called from a button
def current_time():
    print( datetime.datetime.now())
</py-script>
</body>
</html>

Figure 4: Button click to call a PyScript function.
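To make the target/append behavior of display() concrete, here is a tiny stand-in written in plain Python. It is a toy model of the semantics described above, not PyScript's real implementation:

```python
# toy DOM: element id -> text content
dom = {"calzone": ""}

def display(*values, target=None, append=True):
    """Mimic PyScript's display(): write values into a target 'tag'."""
    text = "\n".join(str(v) for v in values)
    if append:
        dom[target] += text       # default: add to the existing content
    else:
        dom[target] = text        # append=False replaces the old content

# same pattern as the calendar example: each click overwrites the <pre>
display("January 2023", target="calzone", append=False)
display("February 2023", target="calzone", append=False)
print(dom["calzone"])  # February 2023
```

This is why Listing 2 passes append=False: without it, each button click would add another year's calendar below the previous one instead of replacing it.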

Listing 2: PyScript Yearly Calendar
01 <html>
02 <head>
03 <link rel="stylesheet" href="https://pyscript.net/latest/pyscript.css" />
04 <script defer src="https://pyscript.net/latest/pyscript.js"></script>
05 <title>Pyscript Calendar Example</title>
06 </head>
07
08 <body>
09 <h1>Pyscript Calendar Example</h1>
10 Move Years:
11 <button id="btn_back"> Back </button>
12 <button id="btn_forward"> Forward </button>
13 <pre id="calzone"></pre>
14
15 <py-script>
16 from pyscript import when
17 import calendar
18
19 thisyear = 2023
20 display(calendar.calendar(thisyear), target="calzone" )
21
22 @when("click", selector="#btn_back")
23 def back_year():
24     global thisyear
25     thisyear -= 1
26     display(calendar.calendar(thisyear), target="calzone", append=False )
27
28 @when("click", selector="#btn_forward")
29 def forward_year():
30     global thisyear
31     thisyear += 1
32     display(calendar.calendar(thisyear), target="calzone", append=False )
33
34 </py-script>
35 </body>
36 </html>
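The text that Listing 2 pours into the <pre> tag comes straight from Python's standard calendar module, which you can preview outside the browser:

```python
import calendar

# the same call that Listing 2 passes to display()
year_text = calendar.calendar(2023)

# a plain-ASCII year overview: the year on the first line,
# followed by one block per month
print(year_text.splitlines()[0].strip())  # 2023
```

Because the output is fixed-width ASCII art, the <pre> tag in Listing 2 is what keeps the month columns aligned on the page.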


Figure 6: You can use JavaScript libraries in PyScript.

Listing 3: JavaScript Libraries with PyScript
<py-script>
# Use a JS library to show a prompt and alert message
from js import alert, prompt
# Ask your name, then show it back
name = prompt("What's your name?", "Anonymous")
alert(f"Hi:, {name}!")
</py-script>

Figure 5: PyScript calendar with @when functions.

<input type="file" id="myfile" name="myfile" accept=".csv">

Next, you must define an event listener to catch a change in the <input> file. For this step, two libraries need to be imported, and an event listener needs to be configured as shown in Listing 4.

Listing 4: Defining an Event Listener
from js import document
from pyodide.ffi.wrappers import add_event_listener

# Set the listener to look for a file name change
e = document.getElementById("myfile")
add_event_listener(e, "change", process_file)

Finally, you need to import the JavaScript FileReader and the PyScript asyncio libraries as follows:

from js import FileReader
import asyncio

The FileReader object is used to read in the CSV file's content. The asyncio library creates background event processing to allow functions to complete successfully without timing or delay issues.
Listing 5 shows the full code for reading and plotting a local CSV file. In Listing 5, pay particular attention to:
• defining a <py-config> section for the pandas and Matplotlib (PyPI) libraries (lines 9-11) and
• creating an async function (process_file(event)).
Note, the async function is launched from the add_event_listener (line 51) when the user selects a file.
The CSV file is read into a variable (line 33), and then the StringIO function allows the data to be passed into a pandas dataframe (lines 36 and 37). Line 38 outputs the dataframe to a py-terminal element:

print("DataFrame of:", f.name, "\n", df)

Line 47 sends the Matplotlib figure to the page's <div id="lineplot"> element:

pyscript.write('lineplot', fig)

This example only presents bar charts for the first two columns of data (lines 42-45), but the code could be modified to do line plots for multiple rows of data.

Figure 7: Read and plot a local CSV file as a bar chart.
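The StringIO step works the same way outside the browser. This pandas-free sketch uses the stdlib csv module on a small made-up CSV string to show what lines 33-37 of Listing 5 are doing:

```python
import csv
from io import StringIO

# stand-in for the text read from the user-selected file
data = "month,sales\nJan,10\nFeb,25\n"

# StringIO turns the big string into a file-like object,
# just as the listing does before handing it to pandas
csvdata = StringIO(data)
rows = list(csv.reader(csvdata))

header, body = rows[0], rows[1:]
print(header)                # ['month', 'sales']
print([r[0] for r in body])  # ['Jan', 'Feb']
```

pandas does the same parsing, but additionally infers column types and wraps the result in a dataframe with named columns.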


include adding options for sorting, grouping, and customized plots. It's important to note that PyScript can also be used to save files to a local machine.

Summary
Using Python libraries such as pandas, SymPy, or Matplotlib on a client page can be a very useful feature. It's also nice that these PyScript pages don't require Python on the client machine.
While working with PyScript, I found two issues. The call-up is very slow (especially compared to Brython pages). In addition, I often got tripped up with Python indentation when I was cutting and pasting code. Overall, however, I was very impressed with PyScript, and I look forward to seeing where the project goes. QQQ

Author
You can investigate more neat projects by Pete Metcalfe and his daughters at https://funprojects.blog.

Info
[1] Brython: https://brython.info/
[2] PyScript: https://pyscript.net/

Listing 5: PyScript CSV File to Bar Chart
01 <!DOCTYPE html>
02 <html lang="en">
03 <head>
04 <title>Pyscript CSV to Plot</title>
05 <link rel="stylesheet" href="https://pyscript.net/latest/pyscript.css" />
06 <script defer src="https://pyscript.net/latest/pyscript.js"></script>
07 <title>Local CSV File to Matplotlib Chart</title>
08 <!-- Include the Pandas and Matplotlib packages -->
09 <py-config>
10 packages = [ "pandas", "matplotlib" ]
11 </py-config>
12 </head>
13 <body>
14
15 <h1>Pyscript: Input Local CSV File and Create a Bar Chart</h1>
16 <label for="myfile">Select a CSV file to graph:</label>
17 <input type="file" id="myfile" name="myfile" accept=".csv"><br>
18
19 <div id="lineplot"> </div>
20 <pre id="print_output"> </pre>
21 <py-script output="print_output">
22 import pandas as pd
23 import matplotlib.pyplot as plt
24 from io import StringIO
25 import asyncio
26 from js import document, FileReader
27 from pyodide.ffi.wrappers import add_event_listener
28
29 # Process a new user selected CSV file
30 async def process_file(event):
31     fileList = event.target.files.to_py()
32     for f in fileList:
33         data = await f.text()
34         # the CSV file is read as large string
35         # use StringIO to pass info into Panda dataframe
36         csvdata = StringIO(data)
37         df = pd.DataFrame(pd.read_csv(csvdata, sep=","))
38         print("DataFrame of:", f.name, "\n", df)
39
40         # create a Matplotlib figure with headings and labels
41         fig, ax = plt.subplots(figsize=(16,4))
42         plt.bar(df.iloc[:,0], df.iloc[:,1])
43         plt.title(f.name)
44         plt.ylabel(df.columns[1])
45         plt.xlabel(df.columns[0])
46         # Write Matplotlib figure to div tag
47         pyscript.write('lineplot', fig)
48
49 # Set the listener to look for a file name change
50 e = document.getElementById("myfile")
51 add_event_listener(e, "change", process_file)
52
53 </py-script>
54 </body>
55 </html>
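The async pattern in Listing 5, an event handler that awaits the file content, can be tried outside the browser with a stub standing in for the JavaScript file object. This is a simplified model, not the Pyodide API:

```python
import asyncio

class FakeFile:
    """Stub standing in for the JS file object of Listing 5."""
    name = "sales.csv"

    async def text(self):
        # in the browser this awaits the FileReader; here it
        # just hands back a canned CSV string
        return "month,sales\nJan,10\n"

async def process_file(f):
    data = await f.text()          # same await as line 33 of Listing 5
    return f.name, len(data.splitlines())

result = asyncio.run(process_file(FakeFile()))
print(result)  # ('sales.csv', 2)
```

The await is what lets the page stay responsive while the (potentially slow) file read completes in the background.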



IN-DEPTH
Programming Snapshot – Go CGI Scripting

Track your weight with a CGI script and Go

Scales Well?
Mike Schilli steps on the scale every week and records his weight fluctuations as a time series. To help monitor his progress, he writes a CGI script in Go that stores the data and draws visually appealing charts. By Mike Schilli

Capturing datapoints, adding them to a time series, and showing values over time graphically is usually the domain of tools like Prometheus. The tool retrieves the status of monitored systems at regular intervals and stores the data as a time series. If outliers occur, the messenger of the gods alerts its human to the fact. Viewing tools such as Grafana display the collected time series in dashboards spread over the last week or year as graphs, if so desired, so that even senior managers can see at a glance what's going on in the trenches.
However, my el cheapo web host won't let me install arbitrary software packages for this purpose on my rented virtual server. Plus, maintaining such complicated products with their continuous updates would be too time consuming for me, anyway. However, there is a '90s-style CGI interface on the web server. How hard could it be to write a CGI program in Go that receives measured values via HTTPS like an API, formats the time series generated from them into an attractive chart, and sends the results back to the browser in PNG format? Let's find out.
Figure 1 shows the graph of a time series that outputs my weight in kilograms over the past few years (possibly embellished for this article) as a chart in the browser after pointing it to the URL on the server. The same CGI script also accepts new incoming data. For example, if my scale shows 82.5 kilograms one day, calling

curl '.../cgi/minipro?add=82.5&apikey=<Key>'

will add the value with the current date to the time series, now permanently stored on the server. If I replace add=... in the URL with chart=1, the script will return the chart with all the values fed in so far.

Jurassic Tech
The CGI protocol is a bona fide dinosaur technology from the heady '90s of the last century. At the time, the first dynamic websites came into fashion after users, having acquired a taste for more than static HTML, began to crave customized content.
It's a time I remember very well: I was working at AOL back then, tasked with freshening up AOL's website in San Mateo, California, as a freshly imported engineer from Germany. At the time, we did everything live on a single server without any form of safety net. A CGI script at the top of the portal page displayed the current date. However, this caused the (only!) server to collapse under the load of what was quite a considerable number of users, because of the need to launch a Perl interpreter for every call. I brought the machine back to life with a compiled C program that did the same job but started faster. Later on, persistent environments such as mod_perl came along and made things a thousand times faster.

All Inclusive
Today, the CGI protocol is frowned upon because a script might tear open a security hole in the server environment, and the startup costs of an external program that launches for every incoming request are immense as user numbers increase. But of course, for my weight barometer, where the server will field maybe two requests per day, this design is justifiable. In a scripting language such as Python, such a mini project would be implemented in next to no time.
But I like the challenge of bundling adding values and displaying the chart into one single static Go binary that has no dependencies. Refreshing various Python libraries every so often by hand with pip3 seems like too much trouble. Once compiled – even if cross-compiled

Lead Image © mathias the dread, photocase.com

Author
Mike Schilli works as a software engineer in the San Francisco Bay Area, California. Each month in his column, which has been running since 1997, he researches practical applications of various programming languages. If you email him at [email protected] he will gladly answer any questions.
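As a point of comparison for the Go version that follows, here is what that "next to no time" Python counterpart might look like. This is a hypothetical sketch, not the author's script: like any CGI program, it pulls the request URL from the REQUEST_URI environment variable and parses the query parameters from it.

```python
import os
from urllib.parse import parse_qs, urlparse

# the web server would set REQUEST_URI before launching the
# script; it is faked here so the sketch runs standalone
os.environ["REQUEST_URI"] = "/cgi/minipro?add=82.5&apikey=X"

query = urlparse(os.environ["REQUEST_URI"]).query
params = parse_qs(query)   # {'add': ['82.5'], 'apikey': ['X']}

# a CGI response is just text on stdout: headers, blank line, body
print("Content-Type: text/plain\n")
for key, vals in params.items():
    print(f"key={key}={vals}")
```

The actual dispatch on add= versus chart= and the chart rendering are what the Go program in the next section takes over.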


Figure 1: The author's weight fluctuations over the years.

on another platform – a statically linked Go program will run until the end of time. Even if the web host were to upgrade the Linux distro to a new version with libraries suddenly disappearing as a result, the all-inclusive Go binary will still soldier on.

Getting Started with CGI
If a web server determines that it needs to respond to a request with an external CGI script based on its configuration, it sets the REQUEST_URI environment variable to the URL of the request, among other things, and calls the associated program or script. The script then retrieves the information required to process the request from its environment variables. In case of a GET request, for example, you only need the URL in REQUEST_URI; its path also includes all the CGI form parameters if present. As a response to the inquiring browser, the script then simply uses print() to write the answer to stdout. The web server picks up the text stream and sends it back to the requesting client.
Listing 1 shows a minimal CGI program in Go. It uses the standard net/http/cgi library, whose Serve() function in line 19 parses the incoming request and then sends the response back to the server.
To do this, it expects a handler function as a parameter. The handler function, defined in line 10, in turn, expects a writer for the output and a reader for the incoming request data as parameters. Calling the Query() library function on the incoming request URL inside the handler returns a map that assigns the names of the incoming CGI parameters to their values. The for loop starting in line 14 iterates over all the entries in the hashmap and outputs all incoming form parameters and their values to the w writer.

Listing 1: cgi-test.go
01 package main
02
03 import (
04   "fmt"
05   "net/http"
06   "net/http/cgi"
07 )
08
09 func main() {
10   handler := func(w http.ResponseWriter, r *http.Request) {
11     qp := r.URL.Query()
12     fmt.Fprintf(w, "Hello\n")
13
14     for key, val := range qp {
15       fmt.Fprintf(w, "key=%s=%s\n", key, val)
16     }
17   }
18
19   cgi.Serve(http.HandlerFunc(handler))
20 }

Static Forever
Compiling and linking the Go code from Listing 1 creates a binary; simply copy this into the web server's cgi/ directory and make it executable. If the web server is configured to call the cgi-test program in case of an incoming request to cgi/cgi-test, it will return the script's output to the requesting web client's browser. Figure 2 shows the results from the point of view of the user submitting the request in Firefox.

Figure 2: The Go program in Listing 1 as a CGI script.

So far, so good – but how do you actually compile Listing 1? After all, the idea is to create a binary that runs on the web host's Linux distro, which may be incompatible with the build environment because it might be missing some shared libraries present on the web server. Go binaries typically only need an acceptable version of the host system's libc. What to do? Docker to the rescue! My web host uses Ubuntu 18.04, which means that the Dockerfile in Listing 2 sets up a compatible environment with this base image on my build host.

LINUX-MAGAZINE.COM ISSUE 278 JANUARY 2024 61


IN-DEPTH
Programming Snapshot – Go CGI Scripting

However, Ubuntu's golang package version is almost always woefully out of date; of course, it's not even remotely usable on the fairly ancient Ubuntu distro running on the web hoster's box. But the Dockerfile can easily work around this; line 7 fetches a tarball with a very recent Go 1.21 release off the web and drops its contents into the root directory of the build environment. Add to that some tools like Git (Go uses Git to fetch GitHub packages) and make for the build, and, presto, you have yourself a Frankenstein distro ready to build a binary for the web host's environment.

Well Prepared

To compile Go sources, the Go compiler often needs to pull the source code of included packages and compile it before linking the final binary. A Docker image without those dependencies installed will dawdle around in the preparation phase for minutes at a time during each build run. It will repeat the process time and time again for every single minor change to the source code. To speed up this phase, line 11 in Listing 2 copies the Go sources for this project into the Docker image, and go mod tidy in line 12 precompiles everything. When a container based on this image is then launched later, Go only needs to compile the sources locally and link everything together. This literally takes just a few seconds. That's what I call putting the fun back into developing and troubleshooting!

The Makefile in Listing 3 assembles the image under the docker target (starting in line 9) and assigns it the cgi-test tag when you run make docker. To compile the source code, you need to call the remote target (starting in line 5) later. This will start a container with docker run and mount the /build directory inside onto the current directory on the host. This means that the generated binary within the container will be easily accessible from outside later.

Listing 2: Dockerfile

01 FROM ubuntu:18.04
02 ENV DEBIAN_FRONTEND noninteractive
03 RUN apt-get update
04 RUN apt-get install -y curl
05 RUN apt-get install -y vim make
06 RUN apt-get install -y git
07 RUN curl https://dl.google.com/go/go1.21.0.linux-amd64.tar.gz >go1.21.0.linux-amd64.tar.gz
08 RUN tar -C /usr/local -xzf go1.21.0.linux-amd64.tar.gz
09 ENV PATH="${PATH}:/usr/local/go/bin"
10 WORKDIR /build
11 COPY *.go *.mod *.sum /build
12 RUN go mod tidy

Listing 3: Makefile.cgi-test

01 DOCKER_TAG=cgi-test
02 SRCS=cgi-test.go
03 BIN=cgi-test
04 REMOTE_PATH=some.hoster.com/dir/cgi
05 remote: $(SRCS)
06   docker run -v `pwd`:/build -it $(DOCKER_TAG) \
07     bash -c "go build $(SRCS)" && \
08     scp $(BIN) $(REMOTE_PATH)
09 docker:
10   docker build -t $(DOCKER_TAG) .

Listing 4: minipro.go

01 package main
02
03 import (
04   "fmt"
05   "net/http"
06   "net/http/cgi"
07   "regexp"
08 )
09
10 const CSVFile = "weight.csv"
11 const APIKeyRef = "3669d95841f6d20ff6a5067a2f2919db4fca6e82"
12
13 func main() {
14   handler := func(w http.ResponseWriter, r *http.Request) {
15     qp := r.URL.Query()
16     params := map[string]string{}
17     for key, val := range qp {
18       if len(val) > 0 {
19         params[key] = val[0]
20       }
21     }
22
23     apiKey := params["apikey"]
24     if apiKey != APIKeyRef {
25       fmt.Fprintf(w, "AUTH FAIL\n")
26       return
27     }
28
29     if len(params["chart"]) != 0 {
30       points, err := readFromCSV()
31       if err != nil {
32         panic(err)
33       }
34       chart := mkChart(points)
35       w.Write(chart)
36     } else if len(params["add"]) != 0 {
37       sane, _ := regexp.MatchString(`^[.\d]+$`, params["add"])
38       if !sane {
39         fmt.Fprintf(w, "Invalid\n")
40         return
41       }
42
43       err := addToCSV(params["add"])
44       if err == nil {
45         fmt.Fprintf(w, "OK\n")
46       } else {
47         fmt.Fprintf(w, "NOT OK (%s)\n", err)
48       }
49     }
50   }
51   cgi.Serve(http.HandlerFunc(handler))
52 }
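One hardening note on the apikey comparison in line 24: a plain string comparison can, in principle, leak timing information about how many leading characters matched. If that matters for your deployment, Go's crypto/subtle package offers a constant-time variant — a sketch, with the helper function name being mine:

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// apiKeyOK compares the submitted key against the reference in constant time.
func apiKeyOK(submitted, reference string) bool {
	return subtle.ConstantTimeCompare([]byte(submitted), []byte(reference)) == 1
}

func main() {
	const ref = "3669d95841f6d20ff6a5067a2f2919db4fca6e82"
	fmt.Println(apiKeyOK(ref, ref))         // prints "true"
	fmt.Println(apiKeyOK("wrong-key", ref)) // prints "false"
}
```

For a low-traffic hobby endpoint like this one, the simple != check in Listing 4 is arguably fine; the constant-time version costs nothing extra, though.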


The actual build process is started by the shell command in line 7, which calls go build. If this works without error, a secure shell via scp finds the final binary in the current directory (but outside the container) and copies it onto the target host. Line 4 uses REMOTE_PATH to specify its address.

No Messing Around

But that's enough messing around with our test balloon. The actual CGI program that generates new values for the time series and later displays them graphically goes by the name of minipro and can be found in Listing 4. It uses the add form parameter to accept new weight measurements from the user via the CGI interface and stores these measurements in the weight.csv CSV file on the server with the timestamp for the current time. This is done by the addToCSV() call in line 43.

In order to block Internet randos from banging on the interface, the CGI program requires an API key; this string is hard-coded in line 11. The requesting API user attaches the secret to the request as the CGI apikey parameter. The program on the server will only continue processing the request if the key matches the hard-coded value; otherwise, it will stop at line 25.

Because CGI parameters cannot be trusted in general, it makes sense to check their validity with regular expressions. This is why line 37 sniffs out the add parameter to see if the string really looks like a floating-point number (i.e., if it exclusively consists of digits and periods). If so, the sane variable is set to true; if not, line 40 terminates the request and returns an error message.

Nicely Done

To see a chart of the time series of values fed in so far, you just set the CGI chart parameter in the request to an arbitrary value. In response, the section starting in line 29 of Listing 4 uses mkChart() to create a new chart file in PNG format (see Listing 6) and calls w.Write() to return the chart's binary data to the requesting browser in line 35. Fortunately, the net/http/cgi library is smart enough to set the introductory HTTP header to Content-Type: image/png when it examines the first few bytes of the stream and finds sequences there that point to a PNG image.

Listing 5 takes care of managing the CSV file. Its content consists of the floating-point values of the weight measurements, each of which is accompanied by a timestamp in epoch format after a comma in each line. Figure 3 shows some of the stored data in the file.

Figure 3: The weight measurements as floating-point values with timestamps.

Guaranteed Write

In Listing 5, the addToCSV() function starting in line 10 has the task of accepting new measurements. It opens the CSV file in O_APPEND mode; this means that the fmt.Fprintf() write function in line 18 will always append new values, with

Listing 5: csv.go

01 package main
02
03 import (
04   "encoding/csv"
05   "fmt"
06   "os"
07   "time"
08 )
09
10 func addToCSV(val string) error {
11   f, err := os.OpenFile(CSVFile,
12     os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
13   if err != nil {
14     return err
15   }
16   defer f.Close()
17
18   _, err = fmt.Fprintf(f, "%s,%d\n", val, time.Now().Unix())
19   return err
20 }
21
22 func readFromCSV() ([][]string, error) {
23   points := [][]string{}
24
25   file, err := os.Open(CSVFile)
26   if err != nil {
27     if os.IsNotExist(err) {
28       return points, nil
29     } else {
30       return points, err
31     }
32   }
33   defer file.Close()
34
35   reader := csv.NewReader(file)
36   points, err = reader.ReadAll()
37   return points, err
38 }


a current timestamp attached, to the end of the file.

This approach has a neat side effect. It ensures that, on POSIX-compatible Unix systems, lines no longer than PIPE_BUF (usually 4,096 bytes under Linux) are always written in full, without another process possibly interfering and ruining the line. In the present case, this is not critically important, because there will be hardly any requests anyway, but on a hard working web server where you cannot guarantee atomicity by default, the file would quickly become corrupt, unless you explicitly set a lock.

Conversely, readFromCSV() starting in line 22 reads the lines from the CSV file, and the standard encoding/csv Go library package takes apart the comma-separated entries. At the end, the function returns a two-dimensional array slice of strings with two entries per line, for the value and timestamp.

Spruce It Up with Graphics

The mkChart() function starting in line 10 of Listing 6 fields this matrix of datapoints and generates a graph like the one shown in Figure 1 from the data. The task of converting the timestamps from the Unix format to an easily readable format for the x-axis is handled automatically by the go-chart package from GitHub. Line 5 in Listing 6 fetches the package.

Line 32 creates a structure of the type chart.TimeSeries from the datapoints in the xVals (timestamps) and yVals (weight measurements) array slices. Then, the chart.Chart structure from line 42 illustrates the structure in a chart. The Render() function in line 49 creates the binary data of a PNG file, containing the diagram, both axes, and their legends from this. To do so, line 48 creates a new write buffer in the variable w. The chart's Render() function writes to the buffer, and Bytes() in line 50 returns its raw bytes to the caller of the function (i.e., the main program) and ultimately the inquiring user's browser.

To assemble the three source files into a static binary, the Makefile in Listing 7 creates a new image with the minipro tag under the docker target using the same Dockerfile I used earlier. Once this is done, make remote first starts the container, mounts its working directory to hold the finished binary later, and then starts the build and link process with go build. If this works without errors, the secure shell scp copies the binary to the web host's CGI directory, as set in REMOTE_PATH. From there, a browser or curl script can then call its functions via the web server, using add to add new datapoints and then chart to graphically enhance and visualize the existing dataset.

Listing 7: Makefile.build

DOCKER_TAG=minipro
SRCS=minipro.go chart.go csv.go
BIN=minipro
REMOTE_PATH=some.hoster.com/dir/cgi

remote: $(SRCS)
  docker run -v `pwd`:/build -it $(DOCKER_TAG) \
  bash -c "go build $(SRCS)" && \
  scp $(BIN) $(REMOTE_PATH)

docker:
  docker build -t $(DOCKER_TAG) .

Listing 6: chart.go

01 package main
02
03 import (
04   "bytes"
05   "github.com/wcharczuk/go-chart/v2"
06   "strconv"
07   "time"
08 )
09
10 func mkChart(points [][]string) []byte {
11   xVals := []time.Time{}
12   yVals := []float64{}
13   header := true
14
15   for _, point := range points {
16     if header {
17       header = false
18       continue
19     }
20     val, err := strconv.ParseFloat(point[0], 64)
21     if err != nil {
22       panic(err)
23     }
24     added, err := strconv.ParseInt(point[1], 10, 64)
25     if err != nil {
26       panic(err)
27     }
28     xVals = append(xVals, time.Unix(added, 0))
29     yVals = append(yVals, val)
30   }
31
32   mainSeries := chart.TimeSeries{
33     Name: "data",
34     Style: chart.Style{
35       StrokeColor: chart.ColorBlue,
36       FillColor:   chart.ColorBlue.WithAlpha(100),
37     },
38     XValues: xVals,
39     YValues: yVals,
40   }
41
42   graph := chart.Chart{
43     Width:  1280,
44     Height: 720,
45     Series: []chart.Series{mainSeries},
46   }
47
48   w := bytes.NewBuffer([]byte{})
49   graph.Render(chart.PNG, w)
50   return w.Bytes()
51 }



IN-DEPTH
Teaming NICs

Bonding your network adapters for better performance

Together
Combining your network adapters can speed up network performance – but a little more testing
could lead to better choices. By Adam Dix

I recently bought a used HP Z840 workstation to use as a server for a Proxmox [1] virtualization environment. The first virtual machine (VM) I added was an Ubuntu Server 22.04 LTS instance with nothing on it but the Cockpit [2] management tool and the WireGuard [3] VPN solution. I planned to use WireGuard to connect to my home network from anywhere, so that I can back up and retrieve files as needed and manage the other devices in my home lab. WireGuard also gives me the ability to use those sketchy WiFi networks that you find at cafes and in malls with less worry about someone snooping on my traffic.

The Z840 has a total of seven network interface cards (NICs) installed: two on the motherboard and five more on two separate add-in cards. My second server with a backup WireGuard instance has 4 gigabit NICs in total. Figure 1 is a screenshot from NetBox that shows how everything is connected to my two switches and the ISP-supplied router for as much redundancy as I can get from a single home network connection.

The Problem

On my B250m-based server, I had previously used one connection directly to the ISP's router and the other three to the single no-name switch, which is connected to the ISP router from one of its ports. All four of these connections are bonded with the balance-alb mode, as you can see in the netplan config file (Listing 1).

Figure 1: Topology of my home network.

For those who are not familiar with the term, bonding (or teaming) is using multiple NIC interfaces to create one connection. The config file in Listing 1 is all that is needed to create a bond in Ubuntu. Since 2018 in version 18.04, Canonical has included netplan


as the standard utility for configuring networks. Netplan is included in both server and desktop versions, and the nice thing about it is that it only requires editing a single YAML file for your entire configuration. Netplan was designed to be human-readable and easy to use, so (as shown in Listing 1) it makes sense when you look at it and can be directly modified and applied while running.

To change your network configuration, go to /etc/netplan, where you will see any YAML config file for your system. If you are running a typical Ubuntu Server 22.04 install, it will likely be named 00-installer-config.yaml. To change your config, you just need to edit this file using nano (Ubuntu Server) or gnome-text-editor (Ubuntu Desktop), save it, and run sudo netplan apply to apply the changes. If there are errors in your config, netplan will notify you upon running the apply command. Note that you will need to use spaces in this file (not tabs), and you will need to be consistent with the spacing.

In Listing 1, you can see that I have four NICs and all of them are set to false for DHCP4 and DHCP6. This ensures that the bond gets the IP address, not an individual NIC. Under the bonds section, I have made one interface called bond0 using all four NICs. I used a static IP address, and so I kept DHCP set to false for the bond also. Since I configured a static IP address, I also need to define the default gateway under the routes section, and I always define DNS servers as a personal preference, though that part wouldn't be required for this config. The last section is where you define what type of bonding you would like to use, and I always choose to go with balance-alb or adaptive load balancing for transmit and receive, as it fits the homelab use case in my experience very well. See the box entitled "Bonding" for a summary of the available bonding options.

The best schema for bonding in your case might not be the best for me. With that in mind, I would recommend researching your particular use case to see what others have done. For most homelab use where utilization isn't constantly maxed out, I believe you will typically find that balance-alb is the best option.

Listing 1: Netplan Configuration File

network:
  version: 2
  renderer: networkd
  ethernets:
    enp6s0:
      dhcp4: false
      dhcp6: false
    enp7s0:
      dhcp4: false
      dhcp6: false
    enp2s4f0:
      dhcp4: false
      dhcp6: false
    enp2s4f1:
      dhcp4: false
      dhcp6: false
  bonds:
    bond0:
      dhcp4: false
      dhcp6: false
      interfaces:
        - enp6s0
        - enp7s0
        - enp2s4f0
        - enp2s4f1
      addresses: [192.168.0.20/24]
      routes:
        - to: default
          via: 192.168.0.1
      nameservers:
        addresses: [8.8.8.8, 1.1.1.1, 8.8.4.4]
      parameters:
        mode: balance-alb
        mii-monitor-interval: 100

Bonding

Bonding options available for Linux systems include:

• balance-rr – a round robin policy that sends packets in order from one to the next. This does give failover protection, but in my opinion, it isn't as good for mixed-speed bonds as some of the other options because there is no "thought" put into which NIC is sending packets. It's simply round robin, one to the next to the next ad infinitum.

• active-backup – simple redundancy without load balancing. You can think of this as having a hot spare. One waits till the other fails and picks up. This can add consistency if you have a flaky NIC or NIC drivers but otherwise is simply one NIC doing nothing for most of the time. This would be a good option, though, if you have a 10G primary NIC to use all of the time and a 1G NIC for backup in case it fails.

• balance-xor – uses a hashing algorithm to give load balancing and failover protection using an additional transmit policy that can be tailored for your application. This option offers advantages but is one of the more difficult policies to optimize.

• broadcast – sends everything from everywhere. While that may sound effective, it adds a lot of noise and overhead to your network and is generally not recommended. This is the brute force, shotgun approach. It offers redundancy but for most applications is wasteful of energy without necessarily offering a higher level of consistency.

• 802.3ad – uses a protocol for teaming, which must be supported by the managed switch you are connecting to. That is its main pitfall, as it requires a switch that supports it. With 802.3ad, you would create link aggregation groups (LAGs). This is considered the "right" way to do it by folks who can always afford to do things the "right" way with managed switches. 802.3ad is the IEEE standard that covers teaming and is fantastic if all of your gear supports it.

• balance-tlb – adaptive transmit load balancing; sends packets based on NIC availability but does not require a managed switch. This option offers failover and is similar to balance-alb with one key difference: Incoming packets are simply sent to whatever NIC was last used so long as it is still up. In other words, this load balances on transmit but NOT on receive.

• balance-alb – the same as balance-tlb but also balances the load of incoming packets. This gives the user failover as well as transmit and receive load-balancing without requiring a managed switch. For me, this is the best option. I have not tested to see if there is a noticeable difference between balance-alb and balance-tlb, but I suspect that for a home server and homelab use there won't be. I would recommend testing the difference between alb and tlb if using this in a production environment as there may be unintentional side effects to the extra work being done on the receive side in terms of latency or utilization.
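As a starting point for experimenting with any of these modes, a stripped-down, two-NIC version of the config in Listing 1 is enough. The interface names and addresses below are placeholders — adjust them to your hardware, and swap mode for whichever policy you want to try:

```yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    eth0: {dhcp4: false, dhcp6: false}
    eth1: {dhcp4: false, dhcp6: false}
  bonds:
    bond0:
      interfaces: [eth0, eth1]
      addresses: [192.168.0.21/24]
      routes:
        - to: default
          via: 192.168.0.1
      nameservers:
        addresses: [192.168.0.1]
      parameters:
        mode: balance-alb
        mii-monitor-interval: 100
```

Running sudo netplan try instead of apply is a safer way to test such changes remotely, because it rolls the configuration back automatically if you lock yourself out.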


Figure 2: Proxmox network configuration.

Findings

What I discovered was that setting up Proxmox with a dedicated port for WireGuard and the remaining ports bonded for all other VMs actually resulted in slower and less consistent speeds for WireGuard than what I had been getting on my previous B250m-based machine with bonded NICs. This is something which I didn't expect, but in retrospect, perhaps I should have.

The initial plan for my new gear was to use one NIC for management only, one for WireGuard only, and the remaining 5 NICs for all of my other VMs on the Proxmox server. My expectation was that having a dedicated NIC used only for the WireGuard VPN would help me to realize faster speeds but also more consistent speeds because the VPN would be independent of my other VMs' network performance. Although that would mean no redundancy for WireGuard on that individual machine, I didn't care, because I now had two servers running. If my new server went down, I could simply connect to the old one.

After experimenting with the configuration, I eventually discovered it was better not to put the VPN on a separate NIC but to use a single port for management only and to team the other 6 NICs in my Proxmox server as that resulted in the best speed and consistency running WireGuard, regardless of the fact that all of my other VMs are using that same bond. Figure 2 shows the configuration. You will see 10 NICs in Figure 2, but three of them are not running. This is an oddity of some quad-port cards in Proxmox. Run the following command to reload the network interface configuration on an hourly basis:

ifreload -a

This command ensures I get all six up and running, albeit with a "failure" each time I ifreload. (Note that it isn't actually a failure since those NICs don't actually exist. You might encounter this problem if you decide to use Proxmox with a bonded quad-port card.)

Results

Figures 3 and 4 show network speeds and ping times. You can see that by bonding the single NIC that was previously dedicated to WireGuard into a team with the other 5 NICs I was able to achieve better ping times and also better speeds. More importantly, the WireGuard speeds were very consistent. Across five runs, I only saw a variation of 0.05Mbps maximum with the six bonded NICs in Proxmox versus a variation of up to 0.45Mbps max in speed variance when using the dedicated NIC. With my previous four NIC B250M setup, the consistency was in the middle at about 0.34Mbps variance, but the speeds were about 0.2Mbps slower on average.

Conclusion

Some of you are likely thinking yeah, of course six NICs are better than one! But the moral of the story is that it all depends on the traffic. When I went back and looked at what the services running on the other VMs were doing, there wasn't much traffic, and they

Figure 3: Comparing network speeds.
Figure 4: Comparing ping times.


were managing anyway. Furthermore, I am either using WireGuard, in which case I am locally connected and the speed from my VPN connection to the VM is local, or else I am using Home Assistant or Paperless from its web interface without having WireGuard running, in which case I don't really care if the VPN is going quickly at that moment or not. If I am at the cafe on my VPN and looking at my camera through Home Assistant, which is probably the worst case scenario for me, then there are enough hops that any speed loss from sharing a bond is negated by the latency of that many hops anyway. With all of this in mind, my best bet was to put as many NICs together as possible in balance-alb mode.

Lastly I would say to homelabbers, you've got to test to find out. With testing, I quickly realized I was leaving performance on the table for no good reason. If I were running services that had lots of traffic or perhaps with a half dozen people using my Plex media server, then reserving a single dedicated NIC for the VPN server would have been beneficial, but for the workload my servers are running, bonding all of the connections gives the best results.

Good luck with your homelab, and definitely check out the tteck GitHub page [4] for more on Proxmox helper scripts.

Info
[1] Proxmox: https://www.proxmox.com/en/
[2] Cockpit: https://cockpit-project.org/
[3] WireGuard: https://www.wireguard.com/
[4] tteck Proxmox GitHub page: https://github.com/tteck/Proxmox

Author
Adam Dix is a mechanical engineer and Linux enthusiast posing as an English teacher after playing around a bit in sales and marketing. You can check out some of his Linux work at the EdUBudgie Linux website (https://www.edubudgie.com).

RPi Flight Simulator Interface MAKERSPACE

MakerSpace
I2C flight simulator interface on a Raspberry Pi

Flying High
A Raspberry Pi running Linux with a custom I2C card and a
small power supply provides an interface for a real-time
flight simulator. By Dave Allerton

In a flight simulation, the equations must be solved at a sufficiently fast rate that the motion (or dynamics) of the simulated aircraft appears to be smooth and continuous, with no delays or abrupt changes resulting from the computations [1]. Typically, the real-time software in a flight simulator updates at least 50 times per second. In other words, all the computations must be completed within 20ms, including the inputs from controls, levers, knobs, selectors, and switches, which must be sampled within the 20ms frame.

Data acquisition of analog and digital inputs is potentially slow. In the case of analog inputs, the signals are sampled, converted, and read into a computer as digital values, and a flight simulator might have several hundred inputs. To illustrate the problem, in a flight simulator that acquires data from 32 analog inputs at 50Hz, the overall sampling rate is 1,600 samples per second. Furthermore, the data must be sampled with sufficient resolution (or accuracy), typically 12-16 bits, and any latency resulting from data acquisition by the simulator modules must be minimized. To avoid any delays caused by simulator modules waiting to capture data, a dedicated I/O system can acquire the data and transfer it to the simulator modules over a local network.

Figure 1: Simulator architecture.


Requirements

A real-time research flight simulator [2] currently installed at Cranfield University (Cranfield, UK) runs on a local network of eight PCs, with the simulation functions partitioned as shown in Figure 1. The I/O system provides an interface between the simulator and the software modules that comprise: the modeling of the aircraft aerodynamics and the engine dynamics, aircraft systems, flight displays, navigation, avionics, an instructor station, control loading, sound generation, flight data recording, three image generators for a visual system, and an optional connection to Matlab. Data is transmitted over the network as broadcast Ethernet UDP packets.

Previously, the I/O system was based on a PC with a set of industrial I/O cards to acquire digital and analog inputs and generate digital and analog outputs. However, the interface cards and the PC used in this I/O system were obsolete, and the Raspberry Pi (RPi) offered a potential replacement. The RPi has sufficient performance to compute the I/O functions in real time, and much of the existing C code could be reused to run under the RPi's Linux operating system. The RPi Ethernet port provides UDP connection to the simulator computers.

The overall structure of the I/O system is shown in Figure 2. The simulator outputs are connected to an existing breakout card, which provides interconnections to the simulator and signal conditioning. The analog multiplexer selects one of 32 inputs, where the channel number (0-31) is given by a 5-bit input. The digital multiplexer selects one of four groups of 8 bits, where the channel number (0-3) is given by a 2-bit input. The selected analog channel is sampled by an analog-to-digital (A/D) chip, and the digital inputs are read into an 8-bit parallel buffer. The four analog outputs drive an electrical control loading system, which provides an artificial feel for the control column and rudder pedals. The breakout card and the I/O interface are connected by a 50-way ribbon cable.

Figure 2: Interface system.

The primary requirement was to provide an I/O interface compatible with the RPi, capable of sampling 32 analog inputs and 32 digital inputs at 50Hz and generating four analog outputs and 24 digital outputs, also at 50Hz, where the resolution of the A/D conversion for the flight simulator is 12 bits. Because no commercial I/O cards for the RPi met this specification in terms of the number of channels, resolution, and sampling rate, a custom solution was developed.

I2C

The 40 GPIO lines of the RPi include support for I2C transfers. The I2C protocol, originally developed by Philips [3], is an interesting approach to interfacing, requiring only two lines to transfer data between devices connected to an I2C bus: a serial data line (SDA) and a serial clock line (SCL). For the RPi, SDA and SCL are included in the GPIO pinout. I2C chip pinouts provide SDA and SCL, a reference voltage, ground, and control pins. Additionally, some I2C chips include pins to define the device address. The I2C protocol offers two advantages: First, the connection to an RPi only requires a few lines; second, a wide range of integrated circuits (ICs) is available for the majority of I/O functions, typically costing less than $10.

One further attraction of an I2C interface is the simplicity of programming. Most transfers only require output of the device address to select a specific register of a chip and then transfer of data to or from an external device. I2C chips are compliant with the I2C data transfer protocol, so a designer only needs to ensure that the RPi activates the SDA and SCL pins in accordance with the protocol, which is provided in software by an I2C driver.

Several I2C libraries are available for the main programming languages, including i2c-tools and wiringpi, simplifying the development of application software for I2C devices. The i2c-dev library is integrated with libc for the RPi and, for programming in C, includes the appropriate header files i2c.h and i2c-dev.h.

A number of manufacturers support I2C for analog and digital data transfers. The


Microchip Technology family of devices was selected for the I/O system because it met the requirements and the cost constraints and operates within the 0-5V range of the simulator equipment. The MCP23008 parallel I/O expansion IC is an 18-pin chip, with eight data lines that can be set individually as inputs or outputs. The MCP3221 IC provides 12-bit A/D conversion with a sampling rate in excess of 20,000 samples per second. The MCP4728 IC provides four 12-bit digital-to-analog outputs, with a conversion time of less than 6μs. The base addresses of these devices are factory set but can be modified by selection of the address lines or by reprogramming the address (not recommended for the faint-hearted). Surface-mount variants were selected for the interface printed circuit board (PCB), although many I2C chips are also available as dual in-line (DIL) packages.

The interface also includes connectors to the breakout card and a voltage level translator to connect the RPi with external inputs and outputs operating at 5V. A Texas Instruments PCA9306 converts SDA and SCL signals between the different voltage levels; the Microchip Technology components are connected to external devices requiring a 5V reference, whereas the RPi operates with a 3.3V reference.

System Design

The requirement of the I/O system was to provide five functions:

• Controlling two multiplexers of the breakout card
• Reading the 32 multiplexed digital inputs
• Reading the 32 multiplexed analog inputs
• Driving four analog outputs (control loading system)
• Providing digital outputs for the multiplexers, the simulator lamps, and an LED diagnostics panel

One MCP23008 is configured for eight outputs to drive the two multiplexers, and a second MCP23008 is configured for eight inputs to read the digital inputs. The MCP3221 has one analog input in the range 0-5V, and the MCP4728 provides four analog outputs in the range 0-5V. The pin connections of these three integrated circuits are shown in Figure 3. The 5V supply reference VDD, the 0V ground reference VSS, and the I2C signals SCL and SDA are common to all three ICs. For the MCP23008, the address lines A0, A1, and A2 can be pulled up to VDD or grounded to select up to eight addresses. The data lines GP0-GP7 provide 8-bit input or output. The reset line ¬RST is pulled up to VDD and the interrupt line INT is not used. For the MCP3221, the single-ended analog input is connected to pin 3. For the MCP4728, the four analog outputs are available at pins 6-9. The ready RDY line is not used and the output latching line ¬LDAC is grounded.

Figure 3: Microchip I2C chipset.

In effect, the board reduces to seven ICs, plus two support ICs, with five

Listing 1: Setting MUX

01 buf[0] = 0;
02 buf[1] = 0; /* set for 8 outputs */

Listing 2: Sampling Analog Input Channels

01 for (chn=0; chn<=31; chn+=1)
02 {
03   outbuf[0] = 9; /* reg 9 channel number for analogue mux */
04   outbuf[1] = (unsigned char) chn;
05
06   messages[0].addr  = MUX_ADR;
07   messages[0].flags = 0;
08   messages[0].len   = 2;
09   messages[0].buf   = outbuf;
10   packets.msgs  = messages;
11   packets.nmsgs = 1;
12   if (ioctl(i2c, I2C_RDWR, &packets) < 0)
13     I2Cerror("unable to set the analogue MUX dir reg");
14
15   messages[0].addr  = ADC_ADR;
16   messages[0].flags = I2C_M_RD;
17 messages[0].len = 2;
03
18 messages[0].buf = inbuf;
04 messages[0].addr = MUX_ADR;
19 packets.msgs = messages;
05 messages[0].flags = 0;
20 packets.nmsgs = 1;
06 messages[0].len = 2;
21 if (ioctl(i2c, I2C_RDWR, &packets) < 0)
07 messages[0].buf = buf;
22 I2Cerror("unable to read ADC ch=%d\n", chn);
08 packets.msgs = messages;
23
09 packets.nmsgs = 1;
24 AnalogueData[chn] = (((unsigned int) inbuf[0] & 0xf)
10 if (ioctl(i2c, I2C_RDWR, &packets) < 0) << 8) + (unsigned int) inbuf[1];

11 I2Cerror("unable to set the MUX dir reg\n"); 25 }



MCP23008 ICs for digital input, digital output, and multiplexer control (40 bits); an MCP3221 for analog input; and an MCP4728 for analog output. One of the MCP23008 ICs drives eight outputs for an LED display, an LM7805 voltage regulator provides a stable 5V reference for the A/D chip, and a PCA9306 voltage level translator converts I2C signals between the RPi (3.3V) and the Microchip Technology ICs (5V). An additional I2C temperature sensor was included on the board.

Software
For the RPi model 3, the I2C driver is enabled by running raspi-config and selecting the I2C configuration setting (400KHz baud rate). With the I2C board connected, the terminal command

  i2cdetect -y -1

identifies the I2C devices and their specific addresses. The relevant I2C header files must be included, and the I2C addresses of the devices are defined in the program to improve readability:

  #include <linux/i2c.h>
  #include <linux/i2c-dev.h>

  #define DIGITAL_OUTPUT1_ADR 0x20
  #define DIGITAL_OUTPUT2_ADR 0x21
  #define DIGITAL_INPUT_ADDR  0x22
  #define MUX_ADR             0x23
  #define LEDS_ADR            0x24
  #define ADC_ADR             0x4d
  #define DAC_ADR             0x60

Before accessing the I2C devices, it is essential to check that they are addressable with a simple test:

  i2c = open("/dev/i2c-1", O_RDWR);
  /* check I2C device is available */
  if (i2c < 0)
    I2Cerror("unable to access I2C bus\n");
  if (ioctl(i2c, I2C_SLAVE, ADC_ADR) < 0) /* check A/D is accessible */
    I2Cerror("unable to access ADC (%2x)\n", ADC_ADR);

The open function checks that access to the I2C devices is enabled. The ioctl call checks specific devices, in this case the A/D chip with an address ADC_ADR; this ioctl call is repeated for all the devices in use.

Two C structures are defined to access the I2C devices, where the fields of the structures are defined in the header file i2c-dev.h:

  struct i2c_rdwr_ioctl_data packets;
  struct i2c_msg messages[1];

Because the MCP23008 8-bit bidirectional buffers are dedicated to input or output, the direction can be set on initialization. For example, the multiplexer (MUX) is set as an output in Listing 1. Similar code is used to set the other 8-bit buffers for input or output. As an example, sampling the 32 analog input channels is illustrated by the code in Listing 2. The multiplexer is set to the channel value chn, and the A/D chip value is read as 2 bytes to the array inbuf. The result is formed by combining the most significant four bits in inbuf[0] with the least significant 8 bits in inbuf[1], which is stored in

Figure 4: I/O system schematic.



array AnalogueData[] of 32-bit unsigned integers. With the I2C configured for a baud rate of 400Kbits/s, an RPi 3 Model B samples 32 analog inputs in 8.4ms, which is less than half the 20ms frame time.

For the flight simulator, after initialization, the I/O system repeatedly executes a loop that comprises broadcasting a UDP packet containing the sampled data, reading 32 analog inputs, reading 32 digital inputs, writing four analog outputs, writing four digital outputs, responding to UDP packets from the simulator PCs, and updating a small LED display. The interface is scalable and includes expansion for additional digital inputs and outputs. Additionally, the RPi interface provides a timing reference for the simulator, ensuring accurate maintenance of the frame rate.

Board Design
The schematic is shown in Figure 4. The PCB was produced as a four-layer board (120mm x 95mm) by Eagle CAD software (Figure 5). The design illustrates the simplicity of I2C interfacing for the data acquisition application.

Observations
I2C is a mature and stable protocol supported by a wide range of integrated circuits in both DIL and surface-mount formats, mostly costing less than $10. The RPi provides an interface for I2C devices, requiring only two lines plus power and ground, so that construction of an interface with breadboard, wire-wrap, or PCB is straightforward. For the flight simulator application, I2C fully meets the requirements in terms of sampling rates, resolution, and data throughput. With the GNU GCC tool chain, programming of the I2C devices was straightforward and required only a few lines of code to access each device.

The RPi provides a dedicated headless I/O system, loading and running automatically after power-up and with diagnostic information on the system status provided by a small LED panel. The interface provides raw I/O data for the simulator modules, enabling any scaling or conversion to be applied in the modules.

A Raspberry Pi running under Linux with an I2C interface and a small power supply replaced a PC with two large industrial I/O boards, reducing both the footprint and the cost of the I/O system for a real-time flight simulator. Much of the existing I/O software was reused, and no changes were required to the simulator software.

Figure 5: I/O system PCB layout.

Info
[1] Allerton, D. J. Principles of Flight Simulation. John Wiley and Sons, 2009
[2] Allerton, D. J. Flight Simulation Software: Design, Development and Testing. John Wiley and Sons, 2023
[3] I2C-Bus Specification and User Manual, Rev. 7.0. NXP Semiconductors document UM10204, 2012: https://www.nxp.com/docs/en/user-guide/UM10204.pdf



MAKERSPACE
BCPL for the Raspberry Pi

Before C
The venerable BCPL procedural structured programming language is fast to compile, is reliable and efficient, offers a wide range of software libraries and system functions, and is available on several platforms, including the Raspberry Pi. By Dave Allerton

In the 1960s, the main high-level programming languages were Fortran, Basic, Algol 60, and COBOL. To optimize code or to provide low-level operations, assembler programming offered the only means to access registers and execute specific machine instructions. BCPL, which was used as a teaching language in many universities, provided a language with a rich syntax, addressed the scoping limitations of the other languages, and had low-level operations such as bit manipulation and computation of variable addresses. Where BCPL differs from the other languages is that it is typeless; all variables are considered to be a word, typically 16 or 32 bits. Programmers can access individual bits and bytes of a word, perform both arithmetic and logical operations on words, compute the address of a word, or use a word as a pointer to another word. One further novel aspect of BCPL is that the compiler is small and written in BCPL, producing intermediate code for a virtual machine and simplifying the development of the compiler for a wide range of computers. BCPL was used on mainframe computers and minicomputers in the 1970s and microprocessors in the 1980s.

The early developers of Unix were influenced by, and many aspects of C were adopted directly from, BCPL. Although BCPL also supported characters and bytes, the lack of richer types was addressed in C, which became the programming language of choice for Unix (and subsequently Linux), leaving BCPL mostly for academic applications. Several groups developed compilers, operating systems, software utilities, commercial packages, and even flight simulation software in BCPL, but for the most part, BCPL has been forgotten.

The demise of BCPL in both academia and industry is disappointing, particularly because it is a powerful teaching language, introducing students to algorithms, software design, and compiler design. Later, languages such as Pascal and Modula-2 became popular languages to introduce concepts in computer science but have been superseded by Java, Python, and C++. Whereas the learning curve for BCPL is small, enabling students to become productive in a short time, the complexity of languages such as C++ can be a barrier to students learning their first programming language.

The BCPL Language
The example in Listing 1 of a small BCPL program computes factorial values from 1! to 5!. Because C was developed from



BCPL, the syntax of both languages is similar. The include directive in C is a GET directive in BCPL, the assignment operator = in C is := in BCPL, and the fences (curly brackets) { and } are identical. In C the address of a variable a is denoted by &a, whereas in BCPL it is given by @a. Indirection, or the use of pointers, is given by *a in C or !a in BCPL. Arrays are organized so that a!b in BCPL corresponds to a[b] in C.

The GET directive includes the common procedures and definitions needed in the compilation of a program. The procedure start is similar to main in C, where the VALOF keyword denotes that start is a function with the result returned by the RESULTIS keyword. The variable i, a local variable of the procedure start, is implicitly defined at the start of the FOR loop, which is executed five times. The writef function is similar to printf in C. The recursive function fact tests whether n is zero and returns either 1 or n*(n-1)!, where the parameter n is a local variable of the procedure fact.

Listing 1: 1! to 5! in BCPL

GET "libhdr"

LET start() = VALOF
{
  FOR i = 1 TO 5 DO
    writef("fact(%n) = %i4*n", i, fact(i))
  RESULTIS 0
}

AND fact(n) = n=0 -> 1, n*fact(n-1)

In BCPL, a variable is defined as a word that can represent an integer, a bit pattern, a character, a pointer to a string of characters, a floating-point number, or an address. A programmer can apply arithmetic operators, logical operators, shift operators, an address operator, or indirection to a variable – the compiler assumes that the programmer knows what they are doing and, subject to syntactic and semantic compilation checks, places very few constraints on programming constructions. Arguably, C and BCPL fall into the category of languages that provide almost unlimited power for a programmer with very few checks on their intention.

Both C and BCPL allow sections of a program to be compiled separately (e.g., to provide a library of functions). Global variables and procedures in BCPL, which are similar to external variables and functions in C, can be accessed by all sections of a program, whereas static variables are only accessible from the section in which they are declared. The other category of variables is local or dynamic variables, which are declared and used in the same way as in C. When a local variable is declared, space is allocated on a stack, which grows and shrinks dynamically, typically on entry to and exit from a procedure, respectively, enabling procedures to be called recursively.

Portability
BCPL was developed by Martin Richards in the Computer Laboratory at the University of Cambridge. His more recent Cintcode implementation is extensive and provides numerous examples of coding, mathematical algorithms, and even operating system functions. The advantages of this implementation are considerable: It is fast to compile, is reliable and efficient, and offers a wide range of software libraries and system functions. It is also available on several platforms, including the PC and the Raspberry Pi. The only drawback is the loss of speed from interpreting the compiled code.

I refer you to Martin Richards's textbook [1] and his website [2], which includes a version of Cintcode that is straightforward to download and implement on an RPi. Also, a guide directed at young people programming a Raspberry Pi [3] provides an extensive description of BCPL and the Cintcode implementation and numerous examples of BCPL programs.

For the programmer intending to write applications in BCPL that exploit the processing power of the ARM cores of a Raspberry Pi, a BCPL compiler generating ARM instructions directly is likely to produce code which runs considerably faster than interpreted code. For other users less concerned with processing speed, the tools and support provided by the Cintcode implementation of BCPL offer a stable and reliable platform.

BCPL for the Raspberry Pi
The arrival of the Raspberry Pi with its ARM cores, network connection, sound and video outputs, USB ports, and I/O interface running under the Linux operating system has encouraged the development of a range of programming languages for this platform. A code generator for BCPL that I developed compiles BCPL directly to ARM machine code, which can be linked with the standard Linux gcc toolset. The compiler (7,000 lines) compiles itself in less than 0.2 seconds on a Raspberry Pi 4B.

This 32-bit implementation of BCPL compiles a BCPL program prog.b to prog.o, where prog.o is a Linux object module linked with two libraries – blib.o and alib.o – by the gcc linker to produce an executable ELF module, prog. The library blib.b is written in BCPL and contains the common BCPL library functions. A small library alib.s is written in Linux assembler and contains low-level functions to access the Linux runtime environment.

Although the gcc linker builds the executable program, the object code produced by the compiler contains only blocks of position-independent code, requiring no relocation. At runtime, alib initializes the BCPL environment, setting up the workspace for the stack and global and static variables. Strictly, gcc is only used to generate a Linux-compatible module that can be loaded, whereas the linking of a BCPL program and libraries is performed by alib.

Notes for Developers
The compiler uses registers r0 to r9 for arithmetic operations, logic operations, and procedure calls. The code generator attempts to optimize the code by keeping variables in registers, minimizing the number of memory accesses. Register rg points to the global vector, and register rp is the BCPL stack pointer or frame pointer. Procedure linkage, procedure arguments, and local variables are allocated space in the current frame. Stack space is claimed on entry to a procedure and released on return from a procedure. The link register lr holds the return address on entry to a procedure and can also be used as a temporary register within a procedure. The system stack pointer sp is not used by the BCPL compiler, so it can be used to push and pop temporary variables. The compiler



MAKERSPACE BCPL

uses the BCPL stack for procedure linkage and the storage of local variables. It should be noted that the ARM core is a pipelined processor and reference to pc during an instruction implies the address of the current instruction+8 for most instructions. The program counter pc is used in the code generation of relative addresses used for procedure calls and branches and also in switchon expressions in BCPL.

Although Linux libraries are not explicitly linked, the libc library is available to BCPL programs. Fortunately, the register calling mechanisms of the GNU gcc tool chain and BCPL are distinct and independent. The BCPL stack grows upward, with no access or modification to the system stack. In C, the stack grows downward, and local variables are stored relative to the system stack pointer sp. Consequently, it is possible to call C functions from BCPL.

In the ARM Procedure Call Standard (APCS), the first four arguments are loaded into registers r0, r1, r2, and r3, respectively, and a result is returned in register r0. The address of the procedure is computed, and the procedure is called by an appropriate branch and link (bl) instruction or a branch, link, and exchange instruction (blx).

However, C and BCPL have two important differences: (1) BCPL strings are defined by the string size in the first byte followed by the 8-bit characters of the string, whereas strings in C are arrays of 8-bit characters terminated with a zero byte. BCPL strings must be converted to C strings, if calling C. (2) Addresses of variables, vectors, and strings in BCPL are word addresses, whereas they are machine addresses in C. Passing an address from BCPL to C requires a logical left shift of two places, and passing an address from C to BCPL requires a logical right shift of two places. Care is needed with strings in C because they are not necessarily aligned on 32-bit word boundaries.

In both C and BCPL, the registers r0-r9 are not preserved across procedure calls. Additionally, the BCPL registers rp, rg, and lr cannot be guaranteed to be preserved in C, and it is advisable to store these registers before calling a C procedure. In practice, they can be pushed onto the system stack and popped on return by:

push {rg, rp, lr}
pop {rg, rp, lr}

The code produced by the code generator for the factorial example is shown in Listing 2, with comments to explain specific instructions. Note that register r0 is reloaded at location 0x38 because it is reached by code from locations 0x34 and 0x74; consequently, the content of register r0 is not assured. Additionally, the reference to the string

"fact(%n) = %i4*n"

is not known at location 0x4C when the instruction is generated; therefore, a full static reference is generated with the offset 0x00000028 stored at location 0x90.

Table 1: BCPL Registers
Register  Name  Function
0         r0    Data register 0
1         r1    Data register 1
2         r2    Data register 2
3         r3    Data register 3
4         r4    Data register 4
5         r5    Data register 5
6         r6    Data register 6
7         r7    Data register 7
8         r8    Data register 8
9         r9    Data register 9
10        rg    Global vector
11        rp    BCPL stack
12        ip    Unused
13        lr    Link register
14        sp    System stack pointer
15        pc    Program counter

Listing 2: Code Generator Output

 0: 0000003c  data                       Section size (words)
 4: 0000fddf  data                       Section identifier
 8: 6361660b  data                       Section name "fact"
 c: 20202074  data
10: 20202020  data
14: 0000dfdf  data                       Entry identifier
18: 6174730b  data                       Procedure name "start"
1c: 20207472  data
20: 20202020  data
24: e8a4c800  stmia r4!,{fp,lr,pc}       Standard procedure entry
28: e884000f  stm r4,{r0,r1,r2,r3}
2c: e244b00c  sub fp,r4,#12
30: e3a00001  mov r0,#1                  Initial value i
34: e58b000c  str r0,[fp,#12]            Save i
38: e59b000c  ldr r0,[fp,#12]            Load i
3c: e28b4024  add r4,fp,#36              Set new stack frame
40: eb000017  bl 0xa4                    Call f(i)
44: e1a02000  mov r2,r0                  Arg 3 = f(i)
48: e59b100c  ldr r1,[fp,#12]            Arg 2 = i
4c: e59fe03c  ldr lr,[pc,#60]            Arg 1 = "fact(%n) = %i4*n"
50: e08f000e  add r0,pc,lr               pc offset
54: e1a00120  lsr r0,r0,#2               BCPL address
58: e28b4010  add r4,fp,#16              Set new stack frame
5c: e59ae178  ldr lr,[sl,#376]           Global writef
60: e12fff3e  blx lr                     Call writef()
64: e59b000c  ldr r0,[fp,#12]            Load i
68: e2800001  add r0,r0,#1               Increment by 1
6c: e58b000c  str r0,[fp,#12]            Store i
70: e3500005  cmp r0,#5                  Check end of for-loop
74: daffffef  ble 0x38                   Continue for-loop
78: e3a00000  mov r0,#0                  Return 0
7c: e89b8800  ldm fp,{fp,pc}             Standard procedure return
80: 6361660f  data                       String "fact(%n) = %i4*n"
84: 6e252874  data
88: 203d2029  data
8c: 0a346925  data



Listing 2: Code Generator Output (continued)

90: 00000028  data
94: 0000dfdf  data                       Entry identifier
98: 6361660b  data                       String "fact"
9c: 20202074  data
a0: 20202020  data
a4: e8a4c800  stmia r4!,{fp,lr,pc}       Standard procedure entry
a8: e884000f  stm r4,{r0,r1,r2,r3}
ac: e244b00c  sub fp,r4,#12
b0: e3500000  cmp r0,#0                  Test n=0
b4: 1a000001  bne 0xc0                   Skip if not
b8: e3a00001  mov r0,#1                  Return 1
bc: e89b8800  ldm fp,{fp,pc}             Standard procedure return
c0: e59b000c  ldr r0,[fp,#12]            Load n
c4: e2400001  sub r0,r0,#1               Decrement n
c8: e28b4010  add r4,fp,#16              Set new stack frame
cc: ebfffff4  bl 0xa4                    Call f(n-1)
d0: e59b100c  ldr r1,[fp,#12]            Get n
d4: e0000190  mul r0,r0,r1               Return n*(n-1)
d8: e89b8800  ldm fp,{fp,pc}             Standard procedure return
dc: 00000000  data                       No statics
e0: 00000000  data                       Start of global vector
e4: 00000001  data                       Global 1 (start)
e8: 00000024  data                       Offset to global 1
ec: 0000005e  data                       Maximum global of the section

Installation
The file bcpl_distribution [4] contains the files shown in Table 2. The object files bcpl.o and blib.o each contain a block of position-independent code. The assembler module leader.s provides a means of identifying the start of a BCPL program. The runtime library alib.s is written in assembler code, includes data regions for the global variables and static variables, and is linked to the GNU C runtime library libc. Note that the files bcpl.b and bcplfecg.h are only needed to rebuild the compiler and are not required for user applications.

The distribution also includes several BCPL examples and a user guide (Table 3). The programs queens.b and primes.b are described in Martin Richards's excellent notes to young people interested in programming the Raspberry Pi [3].

To install BCPL on a Raspberry Pi Model 3 or 4, create a new directory and copy the distribution files in bcpl-distribution to this directory. Alternatively, to install BCPL on a Raspberry Pi Model 2, copy the distribution
files in bcpl-distribution-rpi2. In a terminal shell, enter the commands

>unzip bcpl-distribution.zip
>as leader.s -o leader.o
>as alib.s -o alib.o
>gcc leader.o bcpl.o blib.o alib.o -o bcpl

to build and test the compiler (> denotes the Linux prompt).

For a first compiler test, compile and run the program fact.b, which prints the factorial numbers from 1! to 5!:

>./bcpl fact.b -o fact
>./fact

Further confidence tests rebuild the BCPL compiler bcpl.b with the BCPL compiler and build the library blib.b:

>./bcpl bcpl.b -o bcpl
>./bcpl -c blib.b

The BCPL library files and the compiler can then be copied to the appropriate Linux shared directories:

>sudo mkdir /usr/include/BCPL
>sudo cp libhdr.h /usr/include/BCPL/
>sudo cp bcpl /usr/bin/
>sudo cp leader.o /usr/lib/
>sudo cp blib.o /usr/lib/
>sudo cp alib.o /usr/lib/

The remaining BCPL programs can now be compiled and run with the command bcpl rather than ./bcpl. The compiler searches for library files in the working directory before searching the directories /usr/include/BCPL and /usr/lib.

Nostalgia
The influence of BCPL on the development of C and its later variants cannot be overstated. The availability of BCPL for the Raspberry Pi allows old computer science students to dust off copies of their programs, which should run directly on the Raspberry Pi. BCPL was used extensively in many UK university computer science departments. The portable multitasking operating system Tripos was written entirely in BCPL in the Computer Laboratory at the University of Cambridge and used in early versions of the Commodore Amiga, in the automotive industry, and in financial applications. The logic simulator HILO-2 (the forerunner of Verilog) was developed in BCPL. Numerous utilities, including the early word processor roff, were written in BCPL. Before the availability of floating-point hardware, I adapted BCPL compilers for the Motorola 6809 and 68000 processors to use scaled fixed-point arithmetic in real-time flight simulation.

Info
[1] Richards, Martin. BCPL: The Language and Its Compiler, revised ed. Cambridge Univ. Press, 2009: https://www.amazon.com/BCPL-Language-Compiler-Martin-Richards/dp/0521286816/ref=sr_1_1
[2] Martin Richards: https://www.cl.cam.ac.uk/~mr10/
[3] Richards, M. Young Persons Guide to BCPL Programming on the Raspberry Pi, Part 1. Cambridge (UK): Computer Laboratory, University of Cambridge, revised 23 Oct 2018: https://www.cl.cam.ac.uk/~mr10/bcpl4raspi.pdf
[4] Code for this article: https://linuxnewmedia.thegood.cloud/s/9nFQcFb2p8oRMEJ

Table 2: bcpl_distribution
File Name   Function
alib.s      A runtime library written in GNU ARM assembler
blib.b      The BCPL runtime library, written in BCPL
blib.o      A precompiled version of the BCPL runtime library blib.b
bcpl.b      The BCPL compiler and code generator to run under Linux
bcpl.o      A precompiled version of the BCPL compiler and code generator
bcplcg.b    The code generator used by the BCPL compiler for the ARM processor
bcplfecg.h  A header file used by the code generator
leader.s    A small assembler program only used to locate the start of a BCPL program
libhdr.h    The standard BCPL header

Table 3: BCPL Examples and User Guide
File        Content
bench.b     A small program to time the execution of a small fragment of BCPL
fact.b      A small program to print the factorial numbers from 1! to 5!
primes.b    A small program to print the prime numbers less than 1,000
queens.b    An implementation of the "Queens" problem for 1 to 16 pieces
guide.pdf   A guide to BCPL for the Raspberry Pi, including installation notes

Author
Dave Allerton obtained a PhD from the University of Cambridge in 1977 and worked in the defense industry before spending 10 years at the University of Southampton as a lecturer in computing. He was the Professor of Avionics at Cranfield University before moving to the University of Sheffield as Professor of Computer Systems Engineering, where he is currently an Emeritus Professor. He is also a Visiting Professor at Cranfield University and at Queen Mary University of London. His research activities include flight simulation, computer graphics, and real-time computing. He is author of two textbooks, Principles of Flight Simulation (Wiley, 2009, ISBN 978-0-470-75436-8) and Flight Simulation Software: Design, Development and Testing (Wiley, 2022, ISBN 978-1-11973-767-4).



LINUX VOICE
INTRODUCTION

Phones? Computers? Calendars? Music devices? Wasn't everything supposed to converge? At least that was the dream 10 years ago. Fast forward to today, and a lot of modern utilities are ending up as apps on your cellphone, but computers are still on the outside looking in. Or are they? This month we take a look at Waydroid, a tool that lets you run Android apps on your Linux system. If you have Android apps that are working well for you, why not keep them handy on your Linux desktop? Also in this month's issue, we show you how to contend with files compressed in the not-free RAR format.

Doghouse – What Is Fun? 80
Jon "maddog" Hall
This month maddog writes about what makes free software fun for him.

Compressing Files with RAR 81
Ali Imran Nagori
The non-free RAR compression tool offers some benefits you won't find with ZIP and TAR.

FOSSPicks 84
Graham Morrison
This month Graham looks at osci-render, Spacedrive, internetarchive, LibrePCB 1.0.0, and more!

Tutorial – Waydroid 90
Harald Jele
Waydroid brings Android apps to the Linux desktop in a simple and effective way.



LINUX VOICE
DOGHOUSE – WHAT IS FUN?

MADDOG'S DOGHOUSE
This month I want to write about what makes free software fun for me. BY JON "MADDOG" HALL

Jon "maddog" Hall is an author, educator, computer scientist, and free software pioneer who has been a passionate advocate for Linux since 1994 when he first met Linus Torvalds and facilitated the port of Linux to a 64-bit system. He serves as president of Linux International®.

Not just the tech

Writing software has always been fun for me. It is like a puzzle, something to be solved with logic, following certain rules. Very little of my work in programming was writing new software. Most of my work in programming was to take things that other people had written and make them run faster or make them simpler to use.

Later in my career I did less programming (yes, I still do a little programming today for my own use) and did more in guiding others to do useful things.

Now, in the twilight of my career, retired from "professional programming" but still volunteering on various projects, I hope to continue guiding others, particularly younger people. And for this I advocate free software and community cooperation.

My entire family (other than my mother and father) worked for the telephone company AT&T at one time or another. My paternal grandmother, her daughter (my aunt), my uncle (my aunt's husband), my brother, and sister-in-law all worked for various branches of AT&T. My co-op jobs through Drexel University (née Drexel Institute of Technology) were with the Western Electric Company (the manufacturing arm of the Bell System).

I first learned programming by taking a correspondence course in "How to Program the IBM 1130 in FORTRAN" through their educational program. After graduating from Drexel and following a couple of career changes, I worked for Bell Laboratories, which is where I learned Unix as a Unix systems administrator. I am telling you all of this because I had a very deep knowledge of telephone switching systems, including what is known as a private branch exchange (PBX), an electronic switchboard used by companies, hotels, restaurants, government installations, and many others.

These PBX systems would usually start at $20,000 to $30,000 and go up from there. Therefore, when I saw a book entitled Asterisk: A Free and Open Source PBX, I instantly knew what it was and what it meant.

I traveled to a users' meeting of Asterisk called AstriCon and met the founder and architect of Asterisk. He told me that he

It was about a decade ago when I was at CeBIT, at that time the world's largest computer show in Hanover, Germany, that I was approached by three people individually who told me that listening to me talk had guided them in their careers. One was the head of their programming team, another was the CTO of their company, and one had started a company based on free and open source software. All three of them pointed to a talk that I had given and how they started down the FOSS path.

As I go around the world, I meet more and more people who tell me that I had a great influence on their career and their lives by telling them about free software, or open hardware, or free culture.

I met a man in Brazil who told me that when he was 16 years old he had nothing. No college education, no skills. But he went to the library and started teaching himself how to be a Linux systems administrator. He practiced on cast-off computers that other people considered junk and eventually got a job doing that. He kept studying, eventually getting a university degree in computer science, and today is a professor teaching other students.

Another Brazilian was working in a bank at the age of 18 and living in a favela. The bank was throwing out some computers and he asked if he could take them home. He reconfigured them, installed Linux, and trained himself in system and network administration. There was very little Internet in the favela, so he decided to start a company installing and selling WiFi there. People laughed at him and told him that no one in the favela would ever pay him for that. Eventually he employed six people full time in his WiFi company.

All of the people I have met and who have benefited from free software are what makes computer science fun for me. It is not the technology itself, although I still like learning about the technologies, but the people and seeing them improve their lives and pass on their knowledge and experiences to the next generation of young people.

I loved the early days of Linux, where the "crazies" met in
had been considering making Asterisk proprietary and closed groups to share their knowledge with other “crazies” who rev-
source, but after listening to one of my talks he decided to make eled in sharing ideas on things such as Tux (the Linux mascot).
it FOSS, and that had made all the difference. I want that fun to continue. Q Q Q

80 JANUARY 2024 ISSUE 278 LINUX-MAGAZINE.COM


COMPRESSING FILES WITH RAR LINUX VOICE

From bytes to bits

Hear Me RAR
The non-free RAR compression tool offers some benefits you
won’t find with ZIP and TAR. BY ALI IMRAN NAGORI

Archiving files is like preserving your digital legacy in a time capsule. It gives you a safety net against unexpected computer crashes or data loss, ensuring you can always recover important files. That's why file compression tools are essential in the realm of Unix-based operating systems such as Linux.

As a Linux user, you're probably familiar with file compression formats such as ZIP and TAR. However, you might also come across RAR files from time to time. Unlike ZIP and TAR, RAR is commercial software [1]. You can use RAR for free for up to 40 days; then you'll need to buy a license, which currently costs around $29. You might be wondering why a Linux user would pay money for a non-free compression tool when ZIP and TAR are available for free. The answer is that RAR offers some benefits when compared to the alternatives, including:
• Higher compression ratio: RAR often provides better compression ratios, resulting in smaller file sizes.
• Password protection: RAR allows for strong password protection, ensuring your sensitive data remains secure.
• File splitting: RAR's ability to split archives into smaller parts is handy for sharing or storing large files.
But even if you don't choose to make RAR your go-to compression utility, you might receive a RAR file from someone else sometime and need to know what to do with it. This article describes the process of working with RAR files in Linux, from installation to extraction and more.

Getting Started with RAR
Linux does not come with RAR support out of the box. To get started with RAR files, you'll need to install the RAR and UnRAR command-line utilities. Furthermore, if you want to make sure you're getting the latest upgrades and maintaining compatibility with proprietary RAR archives, it's best to stick with the official RAR and UnRAR applications. To install these applications, you can use your distribution's package manager. For example, on Ubuntu and Debian-based distributions, you can use the following command:

$ sudo apt install rar unrar

The sudo command is crucial to ensure that you have the necessary privileges to carry out the installation task. Once UnRAR is set up, the unrar command is all set to extract compressed files. But with RAR, things aren't that smooth, because it's a proprietary program. That means you only get the trial time of 40 days. You'll then need to register to keep using it. But that's plenty of time to give it a spin.

Creating Simple RAR Archives
With RAR installed, it's time to create your first RAR archive. The rar command uses the following syntax to create archives from files:

$ rar <option> <name_of_archive> <file_1 file_2 ...file_N>

Let's get a grasp of the meaning of this peculiar syntax. option defines the commands and switches for each of the various file operations. name_of_archive is the name of the file that RAR will produce as output, and the sequence file_1 file_2 ...file_N is a list of the files that will be compressed. There are lots of options you can use with the rar command [2]. You can take a look at these options by simply running RAR alone (Listing 1).

Listing 1: RAR options
01 $ rar
02 Type 'rar -?' for help
03 Usage: rar <command> -<switch 1> -<switch N> <archive> <files...>
04        <@listfiles...> <path_to_extract\>
05 a   Add files to archive
06 c   Add archive comment
07 ch  Change archive parameters
08 cw  Write archive comment to file
09 d   Delete files from archive
10 e   Extract files without archived paths
11 ...

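The one-letter commands in Listing 1 combine into a complete round trip. The following is a minimal sketch rather than an example from the article: it assumes rar and unrar are installed, uses throwaway file names, and skips quietly where the proprietary tools are absent.

```shell
# Round trip with the rar/unrar pair: create, list, verify, extract.
status=skipped
if command -v rar >/dev/null && command -v unrar >/dev/null; then
  workdir=$(mktemp -d)
  cd "$workdir"
  echo "hello rar" > file1.txt
  rar a demo.rar file1.txt > /dev/null    # a: add files to archive
  rar l demo.rar                          # l: list archive contents
  unrar t demo.rar > /dev/null            # t: test archive integrity
  mkdir out
  unrar e demo.rar out/ > /dev/null       # e: extract without archived paths
  cmp -s file1.txt out/file1.txt && status=ok
fi
echo "$status"
```

The l and t commands (list and test) are handy sanity checks before you delete the originals.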



All right, that’s enough of the technical jargon. $ rar a backup.rar file1.txt server.logs users.csv

Let’s put RAR into action and see what it can actu-
ally do. Take some simple text files, say file1.txt, This will create a neat RAR archive named backup.
server.logs, and users.csv, and simply use the rar rar containing file1.txt, server.logs, and users.
command with the subcommand a. Next, put the csv. Interestingly, the -r recursive option lets you
name of the archive you want to create and the add directories whether they include files or not
files you want to include (Figure 1). For example: (Figure 2):

$ rar a -r my_secure_archive.rar BBB/ AAA/ sampleU

.txt

What ends up happening is that everything below


the directory gets compressed as well. That’s a
pretty good thing you might need.

Password-Protected RAR Archives


Security will always be important. That’s why
RAR allows you to protect your archives with
Figure 1: Compressing multiple files with RAR passwords. To create a password-protected
RAR archive, use the -p option followed by your
desired password:

$ rar a -r my_secure_archive.rar BBB/ AAA/ sampleU

.txt -p<my_password>

Just replace the placeholder <my_password> with


your password as shown in Figure 3. Or you can
leave it blank to let the terminal prompt you to
enter the password. That’s all. Your archive, my_
secure_archive.rar, is now password protected.

Creating a Split Archive


Figure 2: Multiple-file and directory compression using RAR. Got a big file to send? Don’t worry. RAR will fix your
file for easier sharing and storage. You can split a
large archive into smaller parts using the -v option
followed by the desired size and unit (e.g., k for ki-
lobytes, m for megabytes) [3]:

$ rar a -v50m my_split_archive.rar <some_largeU

_file>

Executing this command will result in the creation


of several RAR files (Figure 4), each nearly packed
with a maximum size of 50 megabytes.

Let’s Go Extracting
Figure 3: Creating a password-protected RAR archive. Let’s now do some extraction jobs. Extracting
RAR files is pretty much the same as creating
one. However, there is no vendor lock on the
programs that extract the RAR files. You can
choose from multiple options such as WinZip,
WinRAR, 7-Zip, etc. For the time being, let’s go
with the traditional UnRAR program.
First things first, you can extract the archive
to the same directory it is located in. This will
not keep the original directory layout intact
(Figure 5). The directory structure will be lost,
and all items will be put into the single direc-
Figure 4: Creating a split archive with RAR. tory you’re in. To accomplish this task, you




need to use the e subcommand with rar. Here's how it's used:

$ unrar e my_secure_archive.rar

Besides copying the files, it extracts subdirectories without actually recreating them. Sometimes, it might hurt you if you can't get the original layout. But no worries, there is a way out to keep the full directory path (Figure 6). Just hit up the option x. It will do the trick for you:

$ unrar x my_secure_archive.rar

Pretty cool, right? These files get extracted right into your current directory, maintaining their original tree structure intact.

Figure 5: Extracting an RAR archive without layout preservation.

Figure 6: RAR extraction with original layout.

What about unpacking an archive to a preset directory? For this, option -o is at your disposal:

$ unrar e my_secure_archive.rar -o <some_directory_path>

Extracting Password-Protected RAR Archives
If a RAR file is locked down with a password, you have to make sure to drop that fancy password when you're opening it. The -p option comes in handy here. See Figure 7:

$ unrar e my_secure_archive.rar -p<password>

A password ensures that potential intruders can't touch your files.

Figure 7: Extracting a password-protected archive.

Licensing Model of RAR
RAR and WinRAR are commercial software, but they are also shareware or trialware. This means that you can use them for free for a trial period, typically 40 days. After the trial period ends, you must purchase a license to continue using the software [4]. RAR and WinRAR licenses are perpetual, meaning that they are valid for the lifetime of the software. When it comes to a license, you get some serious freedom. You're simply the boss here.
You can use your license on any computer that you own or control. However, you cannot transfer your license to another person without permission. There are two types of RAR and WinRAR licenses:
• Single-user licenses: You purchase one license to use the RAR archiver on one computer.
• Multi-use licenses: This license requires business users to get one license per computer. In a network (server/client) environment, you must purchase a license copy for each separate client (workstation) on which RAR or WinRAR is installed, used, or accessed.
A RAR license lets you enjoy many perks. Here are a few of them:
• You can use WinRAR with any language version.
• One key grants you the liberty to activate RAR on multiple devices, provided it's for noncommercial use.
• You get professional support right from the support staff.

Conclusion
While free options are available, RAR's ease of use and feature set make it a solid choice if you're willing to invest in a license. In conclusion, working with RAR files in Linux is straightforward once you have the RAR and UnRAR utilities installed. Whether you're creating simple archives, adding password protection, or splitting files, RAR offers a range of features that can be valuable for managing your data. Just keep in mind the proprietary licensing when considering its use.

The Author
Ali Imran Nagori is a technical writer and Linux enthusiast who loves to write about Linux system administration and related technologies. He blogs at tecofers.com. You can connect with him on LinkedIn.

Info
[1] RAR for Linux and Mac: https://www.win-rar.com/rar-linux-mac.html?&L=0
[2] RAR manpage: https://manpages.ubuntu.com/manpages/en/man1/rar.1.html
[3] Multi-volume RAR archive: https://www.win-rar.com/split-files-archive.html?&L=0
[4] RAR license: https://www.win-rar.com/winrarlicense.html?&L=
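To complement the Creating a Split Archive section, here is a sketch of the full split-and-reassemble cycle. This is an illustration rather than the article's own example: it assumes a current rar release that names -v volumes .part1.rar, .part2.rar, and so on, uses illustrative file names, and skips quietly where rar and unrar are not installed.

```shell
# Split a file into 1MB volumes, then restore it from the first volume.
status=skipped
if command -v rar >/dev/null && command -v unrar >/dev/null; then
  workdir=$(mktemp -d)
  cd "$workdir"
  head -c 1500000 /dev/urandom > big.bin    # incompressible test data
  rar a -v1m split_demo.rar big.bin > /dev/null
  ls split_demo.part*.rar                   # the generated volumes
  mkdir out
  # Extracting from the first volume pulls in the rest automatically.
  unrar x split_demo.part1.rar out/ > /dev/null
  cmp -s big.bin out/big.bin && status=ok
fi
echo "$status"
```

All volumes must sit in the same directory for unrar to find the continuation parts.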



LINUX VOICE FOSSPICKS

FOSSPicks
Sparkling gems and new releases from the world of Free and Open Source Software
A word of caution for some of these finds. Graham managed to break his speakers and invoke tinnitus after playing with osci-render too long for this issue. BY GRAHAM MORRISON
Oscilloscope music

osci-render

Despite a hardware user interface festooned with knobs and buttons, oscilloscopes perform a rather mundane function: They trace changes in input voltage over time. One input translates changes into movement along one axis while a second input translates changes onto the other axis. When the two input voltages are combined, the trace can move anywhere within the X and Y area of the screen. They're intended to visualize wave cycles within circuits, such as the voltages measured from a crystal oscillator or a microprocessor, and these could look like sine waves, or square pulses. Because they're also electrical signals, an audio signal through a wire is no different, and oscilloscopes are often used to visualize stereo audio signals. The output won't look good on screen, but you can see from this kind of trace whether the two inputs are in phase or compatible with mono speaker equipment. Remarkably, there's a sub-genre of electronic music that generates an audio signal that both sounds interesting (musical may be a stretch too far) and looks amazing on an oscilloscope screen. The process starts with a series of complex transformations from X and Y coordinates into audio voltages that render as a pattern or image on the trace. Creating those transformations has always been difficult and has spawned commercial software for those interested in exploring the transformations further. And there hasn't been an open source option until now.

Osci-render is a graphical application that can be used to transform a 3D model, text, an SVG file, or even a Lua script into a stereo audio signal that will regenerate the image on an oscilloscope. If you're into experimental electronic music, it can also sound amazing. It sounds complicated, but it's easy to get started because the default project loads a 3D cube model by default. Connect your audio output to an oscilloscope, or use the web browser oscilloscope that can be loaded from the main application, and you can see this cube immediately. There are controls for rotating, zooming, and transforming the object, and these affect the sound that subsequently builds the image. The timbre of the audio depends on the complexity of the object, with simple objects more likely to create pleasing sine wave-like sounds and more complex objects generating lots of competing harmonics. A single triangle is an excellent source, for example, but you need to add object movement to animate the sound and the image.

There are several 3D effects too, including wobble and bitcrushing, which break apart the model into a series of lines and sound artifacts. Many of these can be automated with MIDI signals, allowing you to generate both sound and video as a kind of electronic music performance. With careful planning, the results can be both visually and audibly stunning, and you can record the raw audio output directly from the application. If this isn't enough, the project includes an add-on for Blender which links the main camera view to a running instance of osci-render, transforming whatever the Blender camera sees into audio for an oscilloscope. This means you can set up and script a far more complex animation within Blender using its keyframe and staging tools, and run the output directly into an oscilloscope. It may be niche, but it's a lot of fun, and if you're careful, it can sound and look absolutely amazing.

1. Input: Load an OBJ 3D model, enter text, or generate images with Lua. 2. Preview: If you don't have an oscilloscope, you can use your web browser to preview the resultant animation. 3. Audio effects: In changing the sound, these effects transform the visuals in fascinating ways. 4. Output: Record the audio directly, or output the audio to your speakers and oscilloscope. 5. Controls: There are many controls for rotating, reflecting, scaling, and animating your input. 6. MIDI control: Osci-render can be used for live performance with MIDI used to control the values remotely. 7. Frequency: The overall pitch of the audio can be changed to fit your music. 8. Animation: Values can modulate themselves to add their own variation in the output.

Project Website
https://github.com/jameshball/osci-render




File manager

Spacedrive

We've looked at many different file managers on the command line, in a browser, and on the desktop. To differentiate themselves, they each took a unique approach to some aspect of file management, whether that was integrating network access (Dolphin), pure desktop integration (Gnome Files), or aping 1990s DOS functionality (Midnight Commander). Spacedrive's USP is unity, unifying access to all your files and directories, wherever they may be located. It does this by implementing its own Virtual Distributed File System (VDFS) to provide a single API to manipulate and access various different back ends. These back ends include local storage, external storage, and network locations, which are combined into a single library.

After first creating a library – or creating a new, separate library for a different abstraction of the files you want to access – the file management interface is very similar to Dolphin's on KDE, especially with the dark color scheme. You can switch between an icon grid view, a list view, and a media playback view. The latter shows a preview of photos and movies. Also like Dolphin, you can tag files and directories with labels for easier retrieval. Internally, Spacedrive is creating and maintaining its own metadata database of every item you add to each library so that search and retrieval can be as quick as possible, regardless of where the items are stored. The application is at an early stage of development and considers itself to be "alpha" quality. But even in this state it's attracted substantial venture capital for further development. This is reminiscent of the ancient days of Helix Code, Eazel, Nautilus, and the Gnome desktop, but Spacedrive's investment will hopefully result in a self-sufficient project. You can see the beginnings of this in an optional account login. But the project is genuinely open source, and offers a unique new take on how to manage files in an increasingly disparate world of personal data.

Spacedrive is open source and cross-platform, with macOS and Windows builds alongside Linux and a promise for an Android client.

Project Website
https://www.spacedrive.com/

Command-line access

internetarchive

Rather than being an archive used mostly for historical study, the Internet Archive has become the backbone of the contemporary Internet. It often offers unfettered access to otherwise restricted, geo-locked, or paywalled content and is committed to maintaining the unedited ramblings of all-too-spontaneous social media interactions. These are now fundamental to our freedoms online, and it's often the snapshots held by the Internet Archive that keep people accountable, while also providing a snapshot of online life in what will become a great transition for humankind. The Internet Archive itself is a non-profit organization committed to making all of this available for free, forever. And it's always storing the web, with over 808 billion pages archived so far in 2023 alone, all accessible through your humble web browser.

But the web isn't always the best place for serious research, study, or even to contribute anything more than a couple of files. To help with this, the Internet Archive publishes its own set of open source command-line tools, internetarchive, installable either through Python's pip or as a directly executable binary. This binary interacts with the Internet Archive's own API, and you can use it to perform almost any of the same tasks you can accomplish with a keyboard, mouse, and web browser, only from the convenience of your terminal. It's especially good for automation because it can retrieve JSON-formatted metadata for entries, and to allow the bulk editing and uploading of modified metadata. You can also download specific items from the archives, such as files linked to a page with a certain file type, or even download an entire collection. There's also an option to generate files, such as ePub books, "on the fly" when they're typically created when someone clicks the option on the site. It can save a lot of time, and help you avoid the distractions of browsing away from whatever you were studying to look at a collection of classic Amiga games.

Interact with the Internet Archive from your command line with a selection of open source tools published officially by the project.

Project Website
https://github.com/jjjake/internetarchive/
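The automation described above can be sketched with a couple of commands. These are illustrative rather than exhaustive: the item identifier "nasa" and the glob pattern are placeholders, and the commands run only where the ia client is installed (downloads also need network access).

```shell
# Typical automation with the "ia" client from the internetarchive tools.
status=skipped
if command -v ia >/dev/null; then
  ia metadata nasa > nasa.json       # full JSON metadata for one item
  ia download nasa --glob="*.jpg"    # fetch only the files matching a pattern
  status=ok
fi
echo "$status"
```

Because the metadata arrives as JSON, it pipes cleanly into tools such as jq for scripted bulk edits.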




Circuit designer

LibrePCB 1.0.0

Linux and open source excel at nurturing software for specialized interests. We're inundated by esoteric synthesizers, domain-specific programming languages, and desktop applications for all kinds of diversions. One particularly well-appointed diversion is the design of printed circuit boards (PCBs), which we've looked at in KiCad 7, QElectroTech, Horizon EDA, and even the logic simulator BOOLR. We can now add to these LibrePCB, another excellent desktop PCB design tool that is particularly committed to being open source and easy to use. Under development since 2013, LibrePCB is built with C++ and Qt. It's quick, accessible, good looking, and very capable.

There are two main views to the application: a schematics editor and a board editor. As with similar applications, the schematics editor is for the circuit design, while the board editor lets you modify the layout of the circuit on a board ready for pricing. The two are kept synchronized and feature rule checking, multiple layers, and an easy drag-and-drop interface that can switch between various devices or footprints. There's graphical acceleration and a useful 3D view for the circuit design. The most important part, however, is the library for importing symbols, footprints, and pre-designed components. Library management is significantly cleaner in LibrePCB than with other projects, firstly by using the same file format across the entire application, regardless of the type of library, and also in the way it handles dependencies and file paths. If you install a library with a dependency on another library, that library and its own dependencies will also be installed. If you've struggled through Arduino platform libraries, as well as where those libraries might be installed, you'll appreciate how difficult this can be. Combine this with the library editor for adding your own components, and you have a fantastic package for all-in-one PCB design that even features its own fabrication service for painless PCB ordering.

One of the best things about LibrePCB is the fabulous documentation, which includes a brilliant on-boarding tutorial.

Project Website
https://librepcb.org

Filesystem navigator

nav

After you've learned the basics of the ls and cd commands, navigating the filesystem from a terminal is straightforward. But it can also be a little labor intensive as you cd into a directory and ls to view what's in there before jumping to another location or checking the contents of other directories to find what you're looking for. Tab completion, interactive history, and fuzzy search can all be added and can help massively, but they don't change the core experience. This is something that nav attempts to do, by replacing ls and cd with an interactive filesystem navigator to help you find whatever you're looking for.

As a single binary, nav can replace ls with an alias and takes several of the same arguments. After launching nav, you enter an interactive terminal-based file directory navigator. The arrow keys can now be used to move up and down the contents of the local directory, with Enter to open a directory or quit and return to the current location. Returning to the current location means outputting the path to the standard output, which is intended then to be piped into whatever you need the path for. This could be an editor or a media player, for instance, or any other command requiring a path as an input. You can also use the nav interface to select multiple locations, or files, which are then output as a list. Within nav, you can search, show hidden files, and choose to follow symbolic links. There are a few more shortcuts for returning relative links and an interactive help screen. By keeping things simple, nav feels like a great upgrade over ls, especially if you're new to the command line or can never remember where you stored things.

Define a function such as cd "$(nav --pipe "$@")" to use nav to navigate and switch to the selected directory.

Project Website
https://github.com/dkaslovsky/nav
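The shell function from the screenshot caption can be made a little more defensive. A sketch, with "n" as an arbitrary function name and --pipe taken from the caption; the wrapper is needed because a child process cannot change its parent shell's working directory.

```shell
# Wrapper so a directory chosen in nav becomes the shell's working directory.
n() {
  local dest
  # Only cd when nav printed something and it really is a directory.
  dest="$(nav --pipe "$@")" && [ -d "$dest" ] && cd "$dest"
}
# The function is interactive by nature; here we only confirm it is defined.
type n > /dev/null && echo "defined"
```

Put the function in ~/.bashrc (or your shell's equivalent) so it is available in every session.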




File synchronization

Celeste

Whether it's local, LAN, or server-based, storage is now cheaper than ever. But we're also generating more data than ever, and the two seem to cancel each other out. It's tempting to stick with the default media backup services offered by Amazon, Google, and Apple, but that means putting your trust and privacy in their hands. Unless you're a sys admin, there isn't an easy solution to manage this locally. One of the best tools for backup, for example, is rclone. This is a command-line tool that can synchronize one location to another, with support for dozens of different storage locations, from Amazon to WebDAV, with local files, the Internet Archive, SFTP, and Nextcloud in between. But the best thing about rclone is that it's been around long enough to be trusted. If only it wasn't a command-line tool.

Celeste is the answer. It's a beautiful, minimal graphical application that's been developed to synchronize a local location to a remote location and back. The GUI lists servers on the left and files and directories on the right, with a status icon for each location to show which are being updated. It handles the complexity of excluding specific files and dealing with conflicts when something changes. It can do this while connecting to several cloud providers at the same time. The cloud provider list isn't currently as comprehensive as rclone's, but it still includes Dropbox, Google Drive, Nextcloud, Proton Drive, and WebDAV. This power and capability comes from using rclone as the back end, which is a good thing. It means that while Celeste itself remains under heavy development and is still considered an alpha release, its file synchronization and backup can be trusted, at least for collections you're happy to clone to more than one other location.

Celeste has been written in Rust and is proud of how fast it runs, regardless of the desktop environment.

Project Website
https://github.com/hwittenborn/celeste

File encryption

Cryptomator

Making sure your files are backed up is one thing. Making sure they're secure is quite another. This is especially true when your backups are stored in the cloud because you're trusting the cloud provider to both not peek into your files and also to have rigorous access control. As a user, both of these are impossible to know for certain. That leaves the best solution to be something you can directly control, which inevitably means encrypting things yourself. Similar to backup, there are many open source encryption options, but the best will be something simple and secure. Cryptomator is a strong candidate for being the best. It's an easy-to-use tool with commercial ambitions and a codebase that's been independently audited.

Cryptomator is a cross-platform desktop application that will encrypt your data by first creating a virtual vault and then by letting you unlock the vault at any time to add, remove, or see files inside the vault. You're guided through every step of this process, from creating the vault to entering a passphrase. You can create more than one vault, and the vaults can be stored locally or on any cloud platform with local synchronization support, including Dropbox, Google, OneDrive, and Nextcloud. Unless you choose to trust your desktop's password manager, this passphrase will need to be entered whenever you access the vault. This puts you in direct control over your data, unlike similar vault-like systems in KDE Plasma or even macOS, where the integration could become a security risk. That Cryptomator defaults to using whatever remote storage you have access to is also a huge advantage, and this also enables you to make the most of its cross-platform compatibility, because you can access the same vaults from multiple locations and operating systems.

While Cryptomator is definitely open source, certain features such as the dark mode can only be unlocked from within the official binaries after you've made a financial contribution to the project.

Project Website
https://cryptomator.org




Music workstation

Ardour 8

It’s fantastic being able to write about a major Ardour release every year. It’s a sign that the project is flourishing, both with its modest financial support and with the development efforts that go into each release. When there’s so much discussion about how open source projects can fund themselves, Ardour is a great example of what can be accomplished with binary downloads behind a subscription model while remaining 100-percent open source. The release of Ardour 8 also feels like an inflection point in the project’s own development trajectory because it’s the first major release with a creative bias rather than a productive one. This means most of its new features are targeted at the creative or compositional stages of the music-making process, rather than the later production or mastering stages. And even more important, they target MIDI note data rather than the audio data, which has traditionally been Ardour’s target.

A great example of Ardour’s new creativity is being able to use “lollipops” to edit the velocity values for MIDI notes. Velocity values for each note have always been editable, but only by selecting each note individually. Velocity is now shown in its own “lane” beneath the notes, letting you still edit individually or drag the cursor across to change multiple values at once. The lollipop sticks will adjust themselves accordingly. You can also finally draw automation curves freehand by dragging the mouse across an automation lane, rather than clicking through each point individually. This makes controlling things such as a filter cutoff or modulation much more intuitive while still retaining the sample-accurate interpolated automation integrated into Ardour. MIDI tracks now have note names in the note matrix using the MIDNAM standard. This is particularly useful for drum tracks when the labels are used to show which notes trigger which drum sounds, but they’re also handy when you use scales that differ from the standard 12-TET.

The best new creative features, however, come from a third-party contribution that uses Lua to create three new arpeggiators. An arpeggiator is a classic note-generating tool that creates variations of the notes you input. Enter the C, E, and G notes for a C-major chord, for example, and an arpeggiator will trigger them in rising, descending, or random orders. Ardour’s arpeggiators can do this, but they can also add rhythmic accents to notes by adjusting their velocity when they synchronize with the current time signature. However, it’s the random arpeggiator that is the most fun to play with. It offers control over harmonic content and can generate all kinds of interesting output that you can serendipitously incorporate into your own music. If this isn’t enough, Ardour now includes an algorithmic composition arpeggiator called “Raptor.” This includes note filters, conditions for output notes, limits, pitch tracking, and a totally unique sound of its own.

The production stage hasn’t been ignored either. Ardour continues to become easier to use and more intuitive. You can now select more than one channel to create a “quick group,” for instance, so that any control you move will affect all selected channels. This is very useful for small changes, and you can still create formal track or bus groups to ensure many channels are processed with the same signal path. Similarly, you can select more than one region in a clip or recording and group these together, much like you might with the elements of a diagram in Inkscape. The project tempo can now be adjusted dynamically too, without the rigidity of a fixed grid or click track. This works by manually dragging lines to the points in a recording that you know are timed to hit a specific point. These points will then align across all of your tracks, regardless of how well timed their recordings were. It’s a brilliant addition to an application that continues to go from strength to strength.

The clip view was the major new addition in the previous release, augmented in Ardour 8 by support for Launchpad Pro hardware.

Ardour 8 is now one of the best digital audio and MIDI applications you can install on any platform, at any cost.

Project Website
https://ardour.org
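The three note orders described above are easy to see in miniature. The following is purely an illustrative sketch in shell, not Ardour’s Lua code: given the MIDI note numbers of a C-major chord, it prints the rising, descending, and random orders an arpeggiator might cycle through.

```shell
#!/bin/sh
# Toy model of the three arpeggiator note orders (not Ardour's implementation).
# MIDI note numbers for a C-major chord: C4=60, E4=64, G4=67.
chord="60 64 67"

# Sort the notes numerically for the rising order, in reverse for the
# descending order, and shuffle them for the random order.
rising=$(printf '%s\n' $chord | sort -n | tr '\n' ' ')
descending=$(printf '%s\n' $chord | sort -nr | tr '\n' ' ')
random=$(printf '%s\n' $chord | shuf | tr '\n' ' ')

echo "rising:     $rising"
echo "descending: $descending"
echo "random:     $random"
```

A real arpeggiator would then also modulate each note’s velocity against the current time signature to add the rhythmic accents mentioned above.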




HPL games engine

Amnesia: The Dark Descent Redux

The Linux gaming landscape was very different back in 2007. We were still in the “optimistically hoping for a miracle” phase, pleading with AAA game publishers to cross-port their titles to Linux. A few of them had, most notably Unreal Tournament 2004, but by 2007 many of our hopes lay with Linux Game Publishing conversions and CodeWeavers and their Wine compatibility hacks. Enter Frictional Games. An entirely new games company founded by people entirely new to gaming. Their first games were the Penumbra trilogy, with each title released natively for Linux alongside the macOS and Windows versions. Frictional has since become hugely successful with their brand of first-person survival horror games.

At the heart of their success is the HPL (H. P. Lovecraft) 3D games engine, which creates a realistic and immersive physically modeled environment in which to set the games. The code for HPL Engine 1 was released as open source in 2010, and Frictional generously did the same for HPL Engine 2 in 2020. This means that people can study and re-implement those engines to keep what are becoming genuine classics running on modern hardware. This is exactly what the Amnesia: The Dark Descent Redux project has done. It’s a rework of the original engine to use Vulkan to play one of Frictional’s best games, Amnesia: The Dark Descent. It tracks the development of the original with additions to provide modern features such as resizing the main window, better performance and occlusion mapping, and a to-do list that includes replacing the entire Newton Game Dynamics physics engine. You still need the original assets, because these were never released, but it’s a great way to replay a genuine classic, and hopefully, to play all of Frictional’s modern classics for a long time to come.

If you already own Amnesia: The Dark Descent, a new implementation of its games engine helps the game run on modern hardware.

Project Website
https://github.com/OSS-Cosmic/AmnesiaTheDarkDescent

Strategy game

Zatikon

Zatikon promises to be “chess evolved.” At first glance, it certainly looks the part. The game is played on an 11x11 checkerboard with pieces that look like chess pieces. These pieces are your army units, and you take turns to move them across the board in an attempt to capture the opposing enemy’s castle. Like chess, a unit can only move in a certain way, but unlike chess, you get to choose which units you start the game with. Units can be bought with gold earned from previous battles, and you buy your own units to construct an army. There are over 100 different units, each with their own price and capabilities. Those capabilities include life, power, armor, move, and range attributes, alongside a special power for the majority of units. Special powers include being able to jump, heal, summon imps, or deploy wolves, and they deeply affect your strategy.

You can play alone, against someone online, or cooperatively, and there’s a handy in-game tutorial to help you get started. In these ways, playing Zatikon is like a combination of chess, turn-based strategy, resource management, and deck building, and it’s a lot of fun. What’s more remarkable is that, until very recently, the game was a commercial enterprise published by Chronic Logic. The open source release only happened after an ambitious player got in touch with the developer and asked whether the game could be made open source. It’s a question that hundreds of commercial projects have been asked, but it’s one that Chronic Logic enthusiastically got behind, working hard on the code to enable an AGPL 3.0 release. This is now available for you to build or install from the Flatpak. If you’ve never played the game before, it’s a brilliant opportunity to play something battle tested by the most critical of players – paying customers.

While the graphics may look austere, the combination of chess strategy with Magic-style deck building in Zatikon feels very modern.

Project Website
https://github.com/zatikon/zatikon



LINUX VOICE TUTORIAL – WAYDROID

Run your Android apps on Linux

Swapping Places
Waydroid brings Android apps to the Linux desktop in a simple and effective way.
BY HARALD JELE

Emulators can be used to run applications from different operating systems in various constellations on Linux. The best-known candidates include Wine (Windows), DOSBox (DOS), and SNES emulators (Nintendo games). But a counterpart for Android has been a long time coming, despite the clear proximity between the two systems. The current Android kernel is derived from a Linux kernel with long-term support (LTS). Despite many patches, there are basically more similarities between Android and Linux than differences. Having said this, running Android applications natively on Linux is complex and involves some tricky detailed work [1].

The makers of the free Waydroid [2] set themselves the task of integrating Android apps into the Linux universe as easily and flexibly as possible. When doing so, they relied on a proven approach and avoided reinventing the wheel. Anbox took a very similar path as early as 2017, but the developers failed to follow up with a useful product. Anbox development was eventually discontinued in 2023.

Waydroid, like Anbox, is based on a container solution inside of which a session manager mounts and then launches an Android image. There are currently two images available, one with the central Google apps (GAPPS) and one without them (VANILLA). Both are descendants of LineageOS and are equivalent to Android 11. They can be updated on the fly by an integrated update mechanism.

Installation

You can install the current Waydroid v1.4.1 on Ubuntu 22.04 LTS with just a few steps. The project website describes the details of the easy-to-follow procedure [3]. In the first step, if not already present, you need to install two Ubuntu packages needed later (Listing 1, line 1). Then add the project’s official repository to the local software sources (line 2); this will keep you up-to-date in the future. These sources are used to install the current version of the application (line 3) later.

With this step, the required program components will now already exist in your Linux setup. Finally, you need to tell Ubuntu’s system and session manager (systemd) to automatically start the Waydroid container at operating system boot time (line 4).

First Launch

When booting, Waydroid explores its configuration and determines prior to the initial launch that an Android instance has not yet been added. It then displays a graphical prompt, asking you to choose one of the two available instances (Figure 1).

You can choose either the Google-free VANILLA version or GAPPS for seamless integration with the Googleverse. If you change your mind later, type init at the command line to instruct Waydroid to load the other image and prepare it for mounting and booting (Listing 1, line 5).

However, Waydroid can only run one Android session inside a container so far. It makes sense to rename the previously loaded image before overwriting it by downloading the other one. The images for a Waydroid session reside in the /var/lib/waydroid/images/ directory and are named system.img and vendor.img. You will want to rename these two files to keep them safe if you make any changes.

Listing 1: Waydroid Setup

01 $ sudo apt install curl ca-certificates -y
02 $ curl https://repo.waydro.id | sudo bash
03 $ sudo apt install waydroid wl-clipboard -y
04 $ sudo systemctl enable --now waydroid-container
05 $ sudo waydroid init -s <SYSTEM_TYPE>

Figure 1: During the install, you need to select the Android image you want to use.




Figure 2: Android apps mingling with native Linux apps in the Ubuntu program launcher.

Using Applications

You are now ready to launch some initial Android apps. These apps blend in with the native Linux apps in the Ubuntu startup folder (Figure 2). Once you have launched the GAPPS image, you need to register it with Google (Google Play certification) to fully enjoy Google Play. To do this, type sudo waydroid shell to start a Waydroid shell. In the shell, you then need to run the less than user-friendly command from line 1 of Listing 2 to discover the device ID and register it with Google [4] on https://www.google.com/android/uncertified.

Registration usually takes only a few seconds after you sign into your Google account. However, there are some posts on forums telling you that the procedure can take up to a few minutes. After completing the registration, you need to restart the Waydroid session (Listing 2, lines 3 and 4).

Your options for launching Android apps include Google Play and the Google settings (Settings | Apps) like on a smartphone or tablet, calling the apps with Waydroid via the Ubuntu application launcher, or launching directly from a terminal. There are specific deployment scenarios for each of these options.

You can use Google Play to install additional apps if needed. F-Droid can also be integrated as an additional source in the usual way. On top of this, Waydroid provides an approach for setting up applications in APK file format directly (Listing 3, line 8).

In the Android Universe

As mentioned before, the Ubuntu application launcher shows you the icons of any Android apps installed with the Google image alongside those of the native installation. For an overview of the apps that have been installed, you can run the waydroid app list command at the command line. You can also use the entries in this list to call an Android application from the command line. Lines 3 to 5 of Listing 3 show you an example of this that references the entry for a Google Docs app. You can launch the app directly in the terminal with the command from line 7.

Listing 2: Google Play Certification

01 $ ANDROID_RUNTIME_ROOT=/apex/com.android.runtime ANDROID_DATA=/data ANDROID_TZDATA_ROOT=/apex/com.android.tzdata ANDROID_I18N_ROOT=/apex/com.android.i18n sqlite3 /data/data/com.google.android.gsf/databases/gservices.db "select * from main where name = \"android_id\";"
02 [...]
03 $ waydroid session stop
04 $ waydroid session start

Listing 3: Launching Apps

01 $ waydroid app list
02 [...]
03 Name: Docs
04 packageName: com.google.android.apps.docs.editors.docs
05 categories: android.intent.category.LAUNCHER
06 [...]
07 $ waydroid app launch com.google.android.apps.docs.editors.docs
08 $ waydroid app install <App>.apk
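The list-then-launch steps above can also be combined into a small helper that resolves an app’s display name to its package name. This is only a sketch: the function name resolve_pkg is made up for illustration, and the only thing assumed is the Name:/packageName: output format shown in Listing 3.

```shell
#!/bin/sh
# resolve_pkg: read "waydroid app list" style output on stdin and print the
# packageName whose "Name:" entry exactly matches the first argument.
# (Hypothetical helper; only the output format from Listing 3 is assumed.)
resolve_pkg() {
  awk -v n="$1" '
    /^Name: /                  { found = (substr($0, 7) == n) }
    found && /^packageName: /  { print $2; exit }'
}

# Usage, with an active Waydroid session:
#   waydroid app launch "$(waydroid app list | resolve_pkg Docs)"
```

Nothing about Waydroid itself is wrapped here; the helper just parses the listing output, so it keeps working as long as that format does.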




Problems

When installing new apps, you are likely to notice, sooner or later, that the Waydroid project still has a few rough edges. One of the most annoying problems is that rotating the display causes some applications to trip over their toes. Not all apps are suitable for operation in landscape mode. This is basically not a peculiarity of Waydroid, because apps like this will also fail if you run them natively on a cell phone or tablet.

What is annoying is the fact that you cannot reach half of the display with the mouse, in this case because Android only uses the width of the portrait format. In previous versions of Waydroid, this area simply remained black. In the current release, Waydroid does display the area, but it still cannot be used. Figure 3 shows a problematic application with the mouse pointer (which is very small in the figure) on the extreme right edge of the accessible area (within the red circle in Figure 3).

Basically, an application like this could be brought in line by rotating manually. However, manually changing the display geometry would then also affect all other apps. The simplest solution is to use the commands in Listing 4 to enable multi-window operation of the display, where all applications are displayed in portrait mode by default (Figure 4).

Currently, problems can still be caused by camera operations, the speakers, and the microphone. Waydroid is very keen on transparently passing the Linux standards through to the container. Nevertheless, it is still potluck as to whether the system responds properly to the hardware setup. The developers have been putting a great deal of work into camera access for quite some time, so there should be some noticeable progress soon.

Listing 4: Multi-Window Mode

$ waydroid prop set persist.waydroid.multi_windows true
$ systemctl restart waydroid-container.service

Figure 3: An example of an application that cannot be used in landscape mode.

Figure 4: In multi-window mode, Waydroid displays all the Android apps in portrait mode.

Command Line

Waydroid can be fully controlled and configured using the command line. But because the software works without any problems in many areas, many of these options remain more or less hidden. If you do want to take a closer look at the services Waydroid offers to the outside world, you will find detailed information about them in the Waydroid documentation [5]. For a first impression, you can try calling the Waydroid status report (Listing 5).

If there isn’t an active session in the container, you can change this with waydroid session start, while waydroid session stop does what it says on the label. If you start an Android app with no active session, Waydroid automatically starts a session with the app.

If there isn’t an active container, you can use

sudo waydroid container <option>

to change this. The options Waydroid accepts are start, stop, restart, freeze, and unfreeze. For the inquisitive or anyone wanting to troubleshoot an issue, it is useful to take a look at the waydroid.log file in /var/lib/waydroid/.

Conclusions

The very flexible, open source Waydroid offers a useful approach to integrating Android applications into a Linux installation. Apps can be set up and used in the same way as on a smartphone. wl-clipboard [6] offers a neat way of exchanging data between the native Linux apps and the Android apps in the container.

Waydroid integrates well into the desktop of an Ubuntu installation; running Android apps is more or less the same as running native Linux apps. The project will very likely enable untroubled access to the camera, microphone, and speakers in the near future. This means that there is nothing stopping you from using your favorite apps from your smartphone or tablet on Linux.

Listing 5: Waydroid Status

$ waydroid status
Session: RUNNING
Container: RUNNING
Vendor type: MAINLINE
IP address: 192.168.240.112
Session user: admunix(1000)
Wayland display: wayland-0

Info

[1] Common Android kernel: https://source.android.com/docs/core/architecture/kernel/android-common?hl=en
[2] Waydroid: https://waydro.id
[3] Installing Waydroid: https://waydro.id/#install
[4] Google Play certification: https://docs.waydro.id/faq/google-play-certification
[5] Waydroid command line: https://docs.waydro.id/usage/waydroid-command-line-options
[6] wl-clipboard: https://github.com/bugaevc/wl-clipboard

The Author

Harald Jele is a member of staff at the University of Klagenfurt. He stumbled across Linux by happy coincidence in 1993 and has been using it on both servers and desktops ever since.
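For scripting, the status report in Listing 5 is easy to pick apart, for example to let a script wait until the session is up before launching an app. The sketch below defines a hypothetical helper, status_field; the only assumption is the "Key: value" output format shown in Listing 5.

```shell
#!/bin/sh
# status_field: read "waydroid status" style output on stdin and print the
# value of the field named by the first argument (e.g. Session, Container).
# (Hypothetical helper; only the "Key: value" format from Listing 5 is assumed.)
status_field() {
  awk -F': *' -v k="$1" '$1 == k { print $2; exit }'
}

# Usage, with the container service running:
#   waydroid status | status_field Session        # prints the session state
#   waydroid status | status_field "IP address"   # prints the container IP
```

Because awk splits on the colon, field names containing spaces, such as "IP address" and "Vendor type", work without any special handling.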



SERVICE
Back Issues

LINUX
NEWSSTAND
Order online:
https://bit.ly/Linux-Magazine-catalog

Linux Magazine is your guide to the world of Linux. Monthly issues are packed with advanced technical
articles and tutorials you won't find anywhere else. Explore our full catalog of back issues for specific
topics or to complete your collection.
#277/December 2023
Low-Code Tools
Experienced programmers are hard to find. Wouldn’t it be nice if subject matter experts and
occasional coders could create their own applications? The low-code revolution is all about
lowering the bar for programming knowledge. This month we show you some tools that let you
assemble an application using easy graphical building blocks.
On the DVD: MX Linux MX-23_x64 and Kali Linux 2023.3

#276/November 2023
ChatGPT on Linux
Everybody’s talking about ChatGPT, and ChatGPT is talking about everything. Sure you can
access the glib and versatile AI chatbot from a web interface, but think of the possibilities if you
tune in from the Linux command line.
On the DVD: Rocky Linux 9.2 and Debian 12.1

#275/October 2023
Think like an Intruder
The worst case scenario is when the attackers know more than you do about your network. If you
want to stay safe, learn the ways of the enemy. This month we give you a glimpse into the mind
of the attacker, with a close look at privilege escalation, reverse shells, and other intrusion
techniques.
On the DVD: AlmaLinux 8.2 and blendOS

#274/September 2023
The Best of Small Distros
Nowadays, all the attention is on big, enterprise distributions supported by professional
developers at big, enterprise corporations, but small distros are still a thing. If you’re shopping
for a Linux to run on old hardware, if you just want a simpler system that is more responsive
and less cluttered, or if you’re looking for a special Linux tailored for a special purpose, you’re
sure to find inspiration in our look at small and specialty Linux systems.
On the DVD: 10 Small Distro ISOs and 4 Small Distro Virtual Appliances

#273/August 2023
Podcasting
On the Internet, you don’t have to wait for permission to speak to the world. Podcasting lets you
connect with your audience no matter where they are. Whether you're in it to build community,
raise awareness about your skills, or just have some fun, the tools of the Linux environment
make it easy to take your first steps.
On the DVD: Linux Mint 21.1 Cinnamon and openSUSE Leap 15.5

#272/July 2023
Open Data
As long as governments have kept data, there have been people who have wanted to see it and
people who have wanted to control it. A new generation of tools, policies, and advocates seeks
to keep the data free, available, and in accessible formats. This month we bring you snapshots
from the quest for open data.
On the DVD: xubuntu 23.04 Desktop and Fedora 38 Workstation



Events

FEATURED EVENTS
Users, developers, and vendors meet at Linux events around the world.
We at Linux Magazine are proud to sponsor the Featured Events shown here.
For other events near you, check our extensive events calendar online at
https://www.linux-magazine.com/events.
If you know of another Linux event you would like us to add to our calendar,
please send a message with all the details to [email protected].

State of Open Con 2024 KickStart Europe FOSS Backstage


Date: February 6-7, 2024 Date: February 26-27, 2024 Date: March 4-5, 2024
Location: London, United Kingdom Location: Amsterdam, Netherlands Location: Berlin, Germany
Website: https://stateofopencon.com/ Website: https://www.kickstartconf.eu/ Website: https://24.foss-backstage.de/
OpenUK’s State of Open Con 2024 will KickStart Europe is the annual strategy What makes an open source project
take place at February 6-7 at The Brewery and networking conference on trends flourish? We want to encourage more
in London. Don't miss the UK’s Open and investments in tech and digital discourse about the non-coding
Technology Conference focused on Open infrastructure. By bringing together an aspects of successful open source
Source Software, Open Hardware, and array of industry professionals at the start projects. The sixth edition of FOSS
Open Data. Join us in London for our of the year, KickStart Europe helps to Backstage will take place in Berlin (and
outstanding content, amenities, and explore the emerging trends and online) on 4th and 5th March 2024.
delegate interactive experiences with technology shaping the digital industry Join us for two days of exciting talks
world-class speakers. and digital infrastructure of cloud, and discussions.
connectivity and data centers.

Events
FOSDEM Feb 3-4 Brussels, Belgium https://fosdem.org/

State of Open Con 24 Feb 6-7 London, United Kingdom https://stateofopencon.com/

DeveloperWeek SF Bay Area Feb 21-23 San Francisco, California https://www.developerweek.com/

KickStart Europe 2024 Feb 26-27 Amsterdam, Netherlands https://www.kickstartconf.eu/

Open Source Camp on Kubernetes Feb 27 Nürnberg, Germany https://opensourcecamp.de/

DeveloperWeek Live Online Feb 27-29 Virtual Event https://www.developerweek.com/

FOSS Backstage Mar 4-5 Berlin, Germany https://24.foss-backstage.de/

Energy HPC Conference Mar 5-7 Houston, Texas https://www.energyhpc.rice.edu/

SCaLE 21x Mar 14-17 Pasadena, California https://www.socallinuxexpo.org/scale/21x

CloudFest 2024 Mar 18-21 Europa-Park, Germany https://www.cloudfest.com/



KubeCon + CloudNativeCon Europe Mar 19-22 Paris, France https://events.linuxfoundation.org/

php[tek] 2024 Apr 23-25 Rosemont, Illinois https://tek.phparch.com/

DrupalCon Portland 2024 May 6-9 Portland, Oregon https://events.drupal.org/portland2024

ISC 2024 May 12-16 Hamburg, Germany https://www.isc-hpc.com/

PyCon US 2024 May 15-23 Pittsburgh, Pennsylvania https://us.pycon.org/2024/



Contact Info / Authors

Contact Info

Editor in Chief: Joe Casad, [email protected]
Copy Editors: Amy Pettle, Aubrey Vaughn
News Editors: Jack Wallen, Amber Ankerholz
Editor Emerita Nomadica: Rita L Sooby
Managing Editor: Lori White
Localization & Translation: Ian Travis
Layout: Dena Friesen, Lori White
Cover Design: Dena Friesen
Cover Images: © Rewat Phungsamrong, 123RF.com, and Lexey111, fotolia.com
Advertising: Brian Osborn, [email protected], phone +49 8093 7679420
Marketing Communications: Gwen Clark, [email protected]
Linux New Media USA, LLC, 4840 Bob Billings Parkway, Ste 104, Lawrence, KS 66049 USA
Publisher: Brian Osborn
Customer Service / Subscription:
For USA and Canada: Email: [email protected], Phone: 1-866-247-2802 (Toll Free from the US and Canada)
For all other countries: Email: [email protected]
www.linux-magazine.com

While every care has been taken in the content of the magazine, the publishers cannot be held responsible for the accuracy of the information contained within it or any consequences arising from the use of it. The use of the disc provided with the magazine or any material provided on it is at your own risk.

Copyright and Trademarks © 2023 Linux New Media USA, LLC. No material may be reproduced in any form whatsoever in whole or in part without the written permission of the publishers. It is assumed that all correspondence sent, for example, letters, email, faxes, photographs, articles, drawings, are supplied for publication or license to third parties on a non-exclusive worldwide basis by Linux New Media USA, LLC, unless otherwise stated in writing.

Linux is a trademark of Linus Torvalds. All brand or product names are trademarks of their respective owners. Contact us if we haven’t credited your copyright; we will always correct any oversight.

Printed in Nuremberg, Germany by Kolibri Druck. Distributed by Seymour Distribution Ltd, United Kingdom. Represented in Europe and other territories by: Sparkhaus Media GmbH, Bialasstr. 1a, 85625 Glonn, Germany.

Linux Magazine (Print ISSN: 1471-5678, Online ISSN: 2833-3950, USPS No: 347-942) is published monthly by Linux New Media USA, LLC, and distributed in the USA by Asendia USA, 701 Ashland Ave, Folcroft PA. Application to Mail at Periodicals Postage Prices is pending at Philadelphia, PA and additional mailing offices. POSTMASTER: send address changes to Linux Magazine, 4840 Bob Billings Parkway, Ste 104, Lawrence, KS 66049, USA.

WRITE FOR US

Linux Magazine is looking for authors to write articles on Linux and the tools of the Linux environment. We like articles on useful solutions that solve practical problems. The topic could be a desktop tool, a command-line utility, a network monitoring application, a homegrown script, or anything else with the potential to save a Linux user trouble and time.

Our goal is to tell our readers stories they haven’t already heard, so we’re especially interested in original fixes and hacks, new tools, and useful applications that our readers might not know about. We also love articles on advanced uses for tools our readers do know about – stories that take a traditional application and put it to work in a novel or creative way.

We are currently seeking articles on the following topics for upcoming cover themes:

• Open hardware
• Linux boot tricks
• Best browser extensions

Let us know if you have ideas for articles on these themes, but keep in mind that our interests extend through the full range of Linux technical topics, including:

• Security
• Advanced Linux tuning and configuration
• Internet of Things
• Networking
• Scripting
• Artificial intelligence
• Open protocols and open standards

If you have a worthy topic that isn’t on this list, try us out – we might be interested!

Please don’t send us articles about products made by a company you work for, unless it is an open source tool that is freely available to everyone. Don’t send us webzine-style “Top 10 Tips” articles or other superficial treatments that leave all the work to the reader. We like complete solutions, with examples and lots of details. Go deep, not wide.

Describe your idea in 1-2 paragraphs and send it to: [email protected]. Please indicate in the subject line that your message is an article proposal.

Authors

Tom Alby 22
Dave Allerton 69, 74
Chris Binnie 38
Zack Brown 12
Rene Brunner 26
Bruce Byfield 6, 32, 46
Joe Casad 3
Mark Crutch 79
Adam Dix 65
Christian Dreihsig 16
Marco Fioretti 48
Jon “maddog” Hall 80
Sebastian Hilgenhof 16
Dr. Harald Jele 90
Vincent Mealing 79
Pete Metcalfe 54
Steffen Möller 16
Graham Morrison 84
Ali Imran Nagori 81
Amy Pettle 36
Mike Schilli 60
Jack Wallen 8
Malte Willert 16



NEXT MONTH
Issue 279 / February 2024
Available Starting January 12

Intrusion
Detection
If intruders were on your network, would you
know it? Next month we show you how to
build an intrusion detection appliance using
a Raspberry Pi and the Suricata IDS tool.

Preview Newsletter
The Linux Magazine Preview is a monthly email
newsletter that gives you a sneak peek at the next
issue, including links to articles posted online.

Sign up at: https://bit.ly/Linux-Update


