Linux Final Notes PDF
ON
LINUX PROGRAMMING
MCA II YEAR, I SEMESTER
(JNTUA-R17)
Mrs.B.VIJAYA
Asst.Professor
UNIT I
Linux Utilities-File handling utilities, Security by file permissions, Process utilities, Disk
utilities, Networking commands, Filters, Text processing utilities and Backup utilities, sed –
scripts, operation, addresses, commands, applications, awk – execution, fields and records,
scripts, operation, patterns, actions, functions, using system commands in awk.
UNIT II
Working with the Bourne again shell(bash): Introduction, shell responsibilities, pipes and
input Redirection, output redirection, here documents, running a shell script, the shell as a
programming language, shell meta characters, file name substitution, shell variables,
command substitution, shell commands, the environment, quoting, test command, control
structures, arithmetic in shell, shell script examples, interrupt processing, functions,
debugging shell scripts.
Linux Files: File Concept, File System Structure,Inodes, File types, The standard I/O (fopen,
fclose, fflush, fseek, fgetc, getc, getchar, fputc, putc, putchar, fgets, gets etc.), formatted I/O,
stream errors, kernel support for files, System calls, library functions, file descriptors, low
level file access - usage of open, creat, read, write, close, lseek, stat family, umask, dup,
dup2, fcntl, file and record locking. file and directory management - chmod, chown,
links(soft links & hard links - unlink, link, symlink), mkdir, rmdir, chdir, getcwd, Scanning
Directories-opendir, readdir, closedir,rewinddir, seekdir, telldir functions.
UNIT III
Linux Process – Process concept, Kernel support for process, process attributes, process
hierarchy,process states,process composition, process control - process creation, waiting for a
process, process termination, zombie process,orphan process, system call interface for
process management-fork, vfork, exit, wait, waitpid, exec family, system.Signals
UNIT IV
Semaphores-Kernel support for semaphores, Linux APIs for semaphores, file locking with
semaphores.
Shared Memory- Kernel support for shared memory, Linux APIs for shared memory,
semaphore and shared memory example.
UNIT V
Sockets: Introduction to Linux Sockets, Socket system calls for connection oriented protocol
and connectionless protocol, example-client/server programs.
UNIT-I
LINUX UTILITIES
Introduction to Linux
Linux is a Unix-like computer operating system assembled under the model of free
and open source software development and distribution. The defining component of Linux is
the Linux kernel, an operating system kernel first released 5 October 1991 by Linus Torvalds.
Linux was originally developed as a free operating system for Intel x86-based
personal computers. It has since been ported to more computer hardware platforms than any
other operating system. It is a leading operating system on servers and other big iron systems
such as mainframe computers and supercomputers: more than 90% of today's 500 fastest
supercomputers run some variant of Linux, including the 10 fastest. Linux also runs on
embedded systems (devices where the operating system is typically built into the firmware
and highly tailored to the system) such as mobile phones, tablet computers, network routers,
televisions and video game consoles; the Android system in wide use on mobile devices is
built on the Linux kernel.
A distribution oriented toward desktop use will typically include the X Window
System and an accompanying desktop environment such as GNOME or KDE Plasma. Some
such distributions may include a less resource intensive desktop such as LXDE or Xfce for
use on older or less powerful computers. A distribution intended to run as a server may omit
all graphical environments from the standard install and instead include other software such
as the Apache HTTP Server and an SSH server such as OpenSSH. Because Linux is freely
redistributable, anyone may create a distribution for any intended use. Applications
commonly used with desktop Linux systems include the Mozilla Firefox web browser, the
LibreOffice office application suite, and the GIMP image editor. Since the main supporting
user space system tools and libraries originated in the GNU Project, initiated in 1983 by
Richard Stallman, the Free Software Foundation prefers the name GNU/Linux.
History of Unix
The Unix operating system was conceived and implemented in 1969 at AT&T's Bell
Laboratories in the United States by Ken Thompson, Dennis Ritchie, Douglas McIlroy, and
Joe Ossanna. It was first released in 1971 and was initially entirely written in assembly
language, a common practice at the time. Later, in a key pioneering approach in 1973, Unix
was re-written in the programming language C by Dennis Ritchie (with the exception of some
kernel and I/O code). The availability of an operating system written in a high-level language
allowed easier portability to different computer platforms.
Today, Linux systems are used in every domain, from embedded systems to
supercomputers, and have secured a place in server installations often using the popular
LAMP application stack. Use of Linux distributions in home and enterprise desktops has been
growing. They have also gained popularity with various local and national governments. The
federal government of Brazil is well known for its support for Linux. News of the Russian
military creating its own Linux distribution has also surfaced, and has come to fruition as the
G.H.ost Project. The Indian state of Kerala has gone to the extent of mandating that all state
high schools run Linux on their computers.
Design
A Linux-based system is a modular Unix-like operating system. It derives much of its
basic design from principles established in Unix during the 1970s and 1980s. Such a system
uses a monolithic kernel, the Linux kernel, which handles process control, networking, and
peripheral and file system access. Device drivers are either integrated directly with the kernel
or added as modules loaded while the system is running.
Separate projects that interface with the kernel provide much of the system's higher-
level functionality. The GNU userland is an important part of most Linux-based systems,
providing the most common implementation of the C library, a popular shell, and many of the
common Unix tools which carry out many basic operating system tasks. The graphical user
interface (or GUI) used by most Linux systems is built on top of an implementation of the X
Window System.
Programming on Linux
Most Linux distributions support dozens of programming languages. The original
development tools used for building both Linux applications and operating system programs
are found within the GNU toolchain, which includes the GNU Compiler Collection (GCC)
and the GNU build system. Amongst others, GCC provides compilers for Ada, C, C++, Java,
and Fortran. First released in 2003, the Low Level Virtual Machine project provides an
alternative open-source compiler for many languages. Proprietary compilers for Linux include
the Intel C++ Compiler, Sun Studio, and IBM XL C/C++ Compiler. BASIC in the form of
Visual Basic is supported in such forms as Gambas, FreeBASIC, and XBasic.
Most distributions also include support for PHP, Perl, Ruby, Python and other
dynamic languages. While not as common, Linux also supports C# (via Mono), Vala, and
Scheme. A number of Java Virtual Machines and development kits run on Linux, including
the original Sun Microsystems JVM (HotSpot), and IBM's J2SE RE, as well as many open-
source projects like Kaffe and JikesRVM.
Linux Advantages
1. Low cost: You don’t need to spend time and money to obtain licenses since Linux and
much of its software come with the GNU General Public License. You can start to work
immediately without worrying that your software may stop working anytime because the
free trial version expires. Additionally, there are large repositories from which you can
freely download high quality software for almost any task you can think of.
2. Stability: Linux doesn’t need to be rebooted periodically to maintain performance levels.
It doesn’t freeze up or slow down over time due to memory leaks and such. Continuous
up-times of hundreds of days (up to a year or more) are not uncommon.
3. Performance: Linux provides persistent high performance on workstations and on
networks. It can handle unusually large numbers of users simultaneously, and can make
old computers sufficiently responsive to be useful again.
UNIX is a copyrighted name, and only big companies are licensed to use the UNIX copyright
and name; IBM AIX, Sun Solaris, and HP-UX are all UNIX operating systems. The Open
Group holds the UNIX trademark in trust for the industry, and manages the UNIX trademark
licensing program.
Linux is just a kernel. A Linux distribution makes it a complete operating system by adding a
GUI system, GNU utilities (such as cp, mv, ls, date, bash), installation & management tools,
GNU C/C++ compilers, editors (vi), and various applications (such as OpenOffice, Firefox).
Most UNIX operating systems, by contrast, are considered complete operating systems, as
everything comes from a single source or vendor.
As noted earlier, Linux is just a kernel, and a Linux distribution turns it into a complete,
usable operating system by adding various applications. Most UNIX operating systems come
with A-to-Z programs such as editors and compilers out of the box; HP-UX and Solaris, for
example, ship this way.
License and cost
Linux is free, both as in free beer and as in freedom. You can download it from the Internet or
redistribute it under GNU licenses. You will see the best community support for Linux. Most
UNIX like operating systems are not free (but this is changing fast, for example OpenSolaris
UNIX). However, some Linux vendors such as Red Hat and Novell provide additional
Linux support, consultancy, bug fixing, and training for additional fees.
User-Friendly
Linux is considered the most user-friendly of the UNIX-like operating systems. It makes it
easy to install sound cards, flash players, and other desktop goodies. However, Apple OS X is
the most popular UNIX operating system for desktop usage.
Security Firewall Software
Linux comes with the open source netfilter/iptables based firewall tool to protect your
server and desktop from crackers and hackers. UNIX operating systems either come with their
own firewall product (for example, Solaris UNIX comes with an ipfilter based firewall) or
require a 3rd party product such as the Checkpoint UNIX firewall.
Backup and Recovery Software
UNIX and Linux come with different sets of tools for backing up data to tape and
other backup media. However, both of them share some common tools such as tar,
dump/restore, and cpio.
File Systems
▪ Linux by default supports and uses the ext3 or ext4 file systems.
▪ UNIX comes with various file systems such as jfs and gpfs (AIX), vxfs (HP-UX), and
ufs/zfs (Solaris).
System Administration Tools
1. UNIX comes with its own tools, such as SAM on HP-UX.
2. Suse Linux comes with YaST.
3. Redhat Linux comes with its own GUI tools, called redhat-config-*.
However, editing text config files and typing commands remain the most popular options for
sysadmin work under both UNIX and Linux.
System Startup Scripts
Almost every version of UNIX and Linux comes with system initialization scripts, but they
are located in different directories:
1. HP-UX - /sbin/init.d
2. AIX - /etc/rc.d/init.d
3. Linux - /etc/init.d
#1: Full access vs. no access
You can look at this from both sides of the fence. Some say giving the public access to
the code opens the operating system (and the software that runs on top of it) to malicious
developers who will take advantage of any weakness they find. Others say that having full
access to the code helps bring about faster improvements and bug fixes to keep those
malicious developers from being able to bring the system down. I have, on occasion, dipped
into the code of one Linux application or another, and when all was said and done, was happy
with the results. Could I have done that with a closed-source Windows application? No.
#2: Licensing freedom vs. licensing restrictions
Along with access comes the difference between the licenses. I’m sure that every IT
professional could go on and on about licensing of PC software. But let’s just look at the key
aspect of the licenses (without getting into legalese). With a Linux GPL-licensed operating
system, you are free to modify that software and use and even republish or sell it (so long as
you make the code available). Also, with the GPL, you can download a single copy of a
Linux distribution (or application) and install it on as many machines as you like. With the
Microsoft license, you can do none of the above. You are bound to the number of licenses you
purchase, so if you purchase 10 licenses, you can legally install that operating system (or
application) on only 10 machines.
#3: Online peer support vs. paid help-desk support
This is one issue where most companies turn their backs on Linux. But it’s really not
necessary. With Linux, you have the support of a huge community via forums, online search,
and plenty of dedicated Web sites. And of course, if you feel the need, you can purchase
support contracts from some of the bigger Linux companies (Red Hat and Novell for
instance).
However, when you use the peer support inherent in Linux, you do fall prey to time.
You could have an issue with something, send out e-mail to a mailing list or post on a forum,
and within 10 minutes be flooded with suggestions. Or these suggestions could take hours or
days to come in. It all seems up to chance sometimes. Still, generally speaking, most
problems with Linux have been encountered and documented. So chances are good you’ll
find your solution fairly quickly.
On the other side of the coin is support for Windows. Yes, you can go the same route
with Microsoft and depend upon your peers for solutions. There are just as many help
sites/lists/forums for Windows as there are for Linux. And you can purchase support from
Microsoft itself. Most corporate higher-ups easily fall victim to the safety net that having a
support contract brings. But most higher-ups haven't had to depend upon said support
contract. Of the various people I know who have used either a Linux paid support contract or
a Microsoft paid support contract, I can’t say one was more pleased than the other. This of
course begs the question “Why do so many say that Microsoft support is superior to Linux
paid support?”
#4: Full vs. partial hardware support
One issue that is slowly becoming nonexistent is hardware support. Years ago, if you
wanted to install Linux on a machine you had to make sure you hand-picked each piece of
hardware or your installation would not work 100 percent. I can remember, back in 1997-ish,
trying to figure out why I couldn’t get Caldera Linux or Red Hat Linux to see my modem.
After much looking around, I found I was the proud owner of a Winmodem. So I had to go
out and purchase a US Robotics external modem because that was the one modem
I knew would work. This is not so much the case now. You can grab a PC (or laptop) and
most likely get one or more Linux distributions to install and work nearly 100 percent. But
there are still some exceptions. For instance, hibernate/suspend remains a problem with many
laptops, although it has come a long way.
With Windows, you know that most every piece of hardware will work with the
operating system. Of course, there are times (and I have experienced this over and over) when
you will wind up spending much of the day searching for the correct drivers for that piece of
hardware you no longer have the install disk for. But you can go out and buy that 10-cent
Ethernet card and know it’ll work on your machine (so long as you have, or can find, the
drivers). You also can rest assured that when you purchase that insanely powerful graphics
card, you will probably be able to take full advantage of its power.
#5: Command line vs. no command line
No matter how far the Linux operating system has come and how amazing the desktop
environment becomes, the command line will always be an invaluable tool for administration
purposes. Nothing will ever replace my favorite text-based editor, ssh, and any given
command-line tool. I can’t imagine administering a Linux machine without the command
line. But for the end user — not so much. You could use a Linux machine for years and never
touch the command line. Same with Windows. You can still use the command line with
Windows, but not nearly to the extent as with Linux. And Microsoft tends to obfuscate the
command prompt from users. Without going to Run and entering cmd (or command, or
whichever it is these days), the user won’t even know the command-line tool exists. And if a
user does get the Windows command line up and running, how useful is it really?
#6: Centralized vs. noncentralized application installation
The heading for this point might have thrown you for a loop. But let’s think about this
for a second. With Linux you have (with nearly every distribution) a centralized location
where you can search for, add, or remove software. I’m talking about package management
systems, such as Synaptic. With Synaptic, you can open up one tool, search for an application
(or group of applications), and install that application without having to do any Web
searching (or purchasing).
Windows has nothing like this. With Windows, you must know where to find the
software you want to install, download the software (or put the CD into your machine), and
run setup.exe or install.exe with a simple double-click. For many years, it was thought that
installing applications on Windows was far easier than on Linux. And for many years, that
thought was right on target. Not so much now. Installation under Linux is simple, painless,
and centralized.
#7: Flexibility vs. rigidity
I always compare Linux (especially the desktop) and Windows to a room where the
floor and ceiling are either movable or not. With Linux, you have a room where the floor and
ceiling can be raised or lowered, at will, as high or low as you want to make them. With
Windows, that floor and ceiling are immovable. You can’t go further than Microsoft has
deemed it necessary to go.
Take, for instance, the desktop. Unless you are willing to pay for and install a third-
party application that can alter the desktop appearance, with Windows you are stuck with
what Microsoft has declared is the ideal desktop for you. With Linux, you can pretty much
make your desktop look and feel exactly how you want/need. You can have as much or as
little on your desktop as you want. From simple flat Fluxbox to a full-blown 3D Compiz
experience, the Linux desktop is as flexible an environment as there is on a computer.
#8: Fanboys vs. corporate types
I wanted to add this because even though Linux has reached well beyond its school-
project roots, Linux users tend to be soapbox-dwelling fanatics who are quick to spout off
about why you should be choosing Linux over Windows. I am guilty of this on a daily basis (I
try hard to recruit new fanboys/girls), and it’s a badge I wear proudly. Of course, this is seen
as less than professional by some. After all, why would something worthy of a corporate
environment have or need cheerleaders? Shouldn’t the software sell itself? Because of the
open source nature of Linux, it has to make do without the help of the marketing budgets and
deep pockets of Microsoft. With that comes the need for fans to help spread the word. And
word of mouth is the best friend of Linux.
Some see the fanaticism as the same college-level hoorah that keeps Linux in the
basements for LUG meetings and science projects. But I beg to differ. Another company,
thanks to the phenomenon of a simple music player and phone, has fallen into the same
fanboy fanaticism, and yet that company’s image has not been besmirched because of that
fanaticism. Windows does not have these same fans. Instead, Windows has a league of paper-
certified administrators who believe the hype when they hear the misrepresented market share
numbers reassuring them they will be employable until the end of time.
#9: Automated vs. nonautomated removable media
I remember the days of old when you had to mount your floppy to use it and unmount
it to remove it. Well, those times are drawing to a close — but not completely. One issue that
plagues new Linux users is how removable media is used. The idea of having to manually
“mount” a CD drive to access the contents of a CD is completely foreign to new users. There
is a reason this is the way it is. Because Linux has always been a multiuser platform, it was
thought that forcing a user to mount a media to use it would keep the user’s files from being
overwritten by another user. Think about it: On a multiuser system, if everyone had instant
access to a disk that had been inserted, what would stop them from deleting or overwriting a
file you had just added to the media? Things have now evolved to the point where Linux
subsystems are set up so that you can use a removable device in the same way you use them
in Windows. But it’s not the norm. And besides, who doesn’t want to manually edit
the /etc/fstab file?
#10: Multilayered run levels vs. a single-layered run level
I couldn't figure out how best to title this point, so I went with a description. What I'm
talking about is Linux's inherent ability to stop at different run levels. With this, you can work
from either the command line (run level 3) or the GUI (run level 5). This can really save your
socks when X Windows is fubared and you need to figure out the problem. You can do this
by booting into run level 3, logging in as root, and finding/fixing the problem.
With Windows, you’re lucky to get to a command line via safe mode — and then you may or
may not have the tools you need to fix the problem. In Linux, even in run level 3, you can still
get and install a tool to help you out (hello apt-get install APPLICATION via the command
line). Having different run levels is helpful in another way. Say the machine in question is a
Web or mail server. You want to give it all the memory you have, so you don’t want the
machine to boot into run level 5. However, there are times when you do want the GUI for
administrative purposes (even though you can fully administer a Linux server from the
command line). Because you can run the startx command from the command line at run level
3, you can still start up X Windows and have your GUI as well. With Windows, you are stuck
at the Graphical run level unless you hit a serious problem.
cat Command:
The cat command concatenates files and prints them on the standard output.
SYNTAX:
The Syntax is
cat [OPTIONS] [FILE]...
OPTIONS:
-A Show all.
-b Numbers the non-blank output lines (blank lines are not numbered).
-e A $ character will be printed at the end of each line, prior to a new line.
-E Displays a $ (dollar sign) at the end of each line.
-n Numbers all the output lines.
-s If the output has multiple adjacent empty lines, squeezes them into one empty line.
-T Displays the tab characters in the output.
-v Non-printing characters (with the exception of tabs, new-lines and form-feeds) are
printed visibly.
EXAMPLE:
cat file1.txt file2.txt
The above cat command will concatenate the two files (file1.txt and file2.txt) and
display the output on the screen. Sometimes the output may not fit the monitor
screen. In such a situation you can redirect the output into a new file or display the
files page by page using the less command.
cat file1.txt file2.txt | less
5. To concatenate several files and to transfer the output to another file.
cat file1.txt file2.txt > file3.txt
In the above example the output is redirected to new file file3.txt. The cat command
will create new file file3.txt and store the concatenated output into file3.txt.
rm Command:
The rm command is used to remove/delete files from a directory.
SYNTAX:
The Syntax is
rm [options..] [file | directory]
OPTIONS:
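Commonly used rm flags are -i (prompt before each removal), -f (force: never prompt, and ignore missing files), and -r (remove directories and their contents recursively). A minimal sketch, run in a scratch directory so nothing real is touched (the file names are invented):

```shell
# Create a scratch directory with some throwaway files
dir=$(mktemp -d)
cd "$dir"
touch notes.txt draft.txt
mkdir -p old/logs

rm notes.txt        # remove a single file
rm -f missing.txt   # -f: no error even though missing.txt does not exist
rm -r old           # -r: remove the directory and everything under it
```

After this session only draft.txt remains in the scratch directory.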
cd Command:
cd command is used to change the current working directory.
SYNTAX:
The Syntax is
cd [directory | ~ | ./ | ../ | - ]
OPTIONS:
1. cd linux-command
This command will take you to the sub-directory(linux-command) from its parent
directory.
2. cd ..
This will change to the parent-directory from the current working directory/sub-
directory.
3. cd ~
This command will move to the user's home directory which is "/home/username".
cp Command:
cp command copy files from one location to another. If the destination is an existing file,
then the file is overwritten; if the destination is an existing directory, the file is copied into the
directory (the directory is not overwritten).
SYNTAX:
The Syntax is
cp [OPTIONS]... SOURCE DEST
cp [OPTIONS]... SOURCE... DIRECTORY
cp [OPTIONS]... --target-directory=DIRECTORY SOURCE...
OPTIONS:
-a same as -dpR.
--backup[=CONTROL] make a backup of each existing destination file.
-b like --backup but does not accept an argument.
-f if an existing destination file cannot be opened, remove it and try again.
-p same as --preserve=mode,ownership,timestamps.
--preserve[=ATTR_LIST] preserve the specified attributes (default:
mode,ownership,timestamps) and security contexts, if possible; additional
attributes: links, all.
--no-preserve=ATTR_LIST don't preserve the specified attributes.
--parents append source path to DIRECTORY.
EXAMPLE:
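A minimal sketch of cp in action (the file names here are invented for the example):

```shell
# Scratch directory with a sample source file
dir=$(mktemp -d)
cd "$dir"
echo "hello" > file1.txt
mkdir backup

cp file1.txt file2.txt    # copy into a new file (overwrites if it exists)
cp -p file1.txt backup/   # copy into a directory, preserving mode,
                          # ownership and timestamps (-p)
```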
ls Command:
ls command lists the files and directories in the current working directory.
SYNTAX:
The Syntax is
ls [OPTIONS]... [FILE]
OPTIONS:
-l Lists all the files and directories with their mode, number of links, owner of the file,
file size, modified date and time, and filename.
-t Lists in order of last modification time.
-a Lists all entries including hidden files.
-d Lists directory files instead of contents.
-p Puts a slash at the end of each directory name.
-u Lists in order of last access time.
-i Displays inode information.
-ltr Lists files ordered by modification time, oldest first.
-lSr Lists files ordered by file size, smallest first.
EXAMPLE:
1. Display inode information:
ls -i
7373080 child.gif
7373081 email.gif
7373076 indigo.gif
The above command displays each filename with its inode value.
ln Command:
ln command is used to create a link to a file (or directory). By default it creates a hard link;
with the -s option it creates a soft (symbolic) link. For a soft link, the inode differs between
source and destination; hard links share the source file's inode.
SYNTAX:
The Syntax is
ln [options] existingfile(or directory)name newfile(or directory)name
OPTIONS:
-f Links files without questioning the user, even if the mode of the target forbids
writing. This is the default if the standard input is not a terminal.
-n Does not overwrite existing files.
-s Used to create soft (symbolic) links.
EXAMPLE:
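A minimal sketch contrasting hard and soft links (file names invented). The inode numbers in the ls -li output show that the hard link shares the source's inode while the soft link has its own:

```shell
dir=$(mktemp -d)
cd "$dir"
echo "original data" > source.txt

ln source.txt hard.txt      # hard link: shares source.txt's inode
ln -s source.txt soft.txt   # -s: soft (symbolic) link with its own inode

ls -li                      # first column shows the inode numbers
```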
mkdir Command:
mkdir command is used to create one or more directories.
SYNTAX:
The Syntax is
mkdir [options] directories
OPTIONS:
1. Create directory:
mkdir test
The above command is used to create the directory 'test'.
2. Create directory and set permissions:
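mkdir's -m option sets the permission bits at creation time, and -p creates any missing parent directories. A sketch (directory names invented):

```shell
dir=$(mktemp -d)
cd "$dir"

mkdir -m 755 public   # create 'public' with permissions rwxr-xr-x
mkdir -p a/b/c        # -p: create the parents 'a' and 'a/b' as needed
```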
rmdir Command:
rmdir command is used to delete/remove empty directories.
SYNTAX:
The Syntax is
rmdir [options..] Directory
OPTIONS:
-p Allows users to remove the directory dirname and those parent directories
which become empty.
EXAMPLE:
1. To delete/remove a directory
rmdir tmp
rmdir command will remove/delete the directory tmp if the directory is empty.
2. To delete a directory tree:
rm -ir tmp
This command recursively removes the contents of all subdirectories of the tmp
directory, prompting you regarding the removal of each file, and then removes the tmp
directory itself.
mv Command:
mv command is short for move. It is used to move/rename a file from one directory to
another. mv differs from cp in that it completely removes the file from the source and
moves it to the specified directory, whereas cp just copies the content from one file to
another.
SYNTAX:
The Syntax is
mv [-f] [-i] oldname newname
OPTIONS:
-f Moves without prompting, even when overwriting an existing target.
-i Prompts before overwriting an existing file.
EXAMPLE:
1. To rename a file:
mv file1.txt file2.txt
This command renames file1.txt as file2.txt.
2. To move a directory
mv hscripts tmp
In the above line the mv command moves all the files, directories, and sub-directories
from the hscripts folder/directory to the tmp directory if the tmp directory already exists.
If there is no tmp directory, it renames the hscripts directory as tmp.
3. To Move multiple files/More files into another directory
mv file1.txt tmp/file2.txt newdir
This command moves the files file1.txt from the current directory and file2.txt from
the tmp folder/directory to newdir.
diff Command:
diff command is used to find differences between two files.
SYNTAX:
The Syntax is
diff [options..] from-file to-file
OPTIONS:
-b Ignores changes in the amount of white space.
-w Ignores all white space.
-y Outputs the files side by side, in two columns.
EXAMPLE:
Let us create two files, file1.txt and file2.txt, with the following data.
Data in file1.txt    Data in file2.txt
HIOX TEST            HIOX TEST
hscripts.com         HSCRIPTS.com
with friend ship     with friend ship
Hioxindia.com
1. Compare files ignoring white space:
diff -w file1.txt file2.txt
This command will compare the file file1.txt with file2.txt ignoring white/blank space
and it will produce the following output.
2c2
< hscripts.com
---
> HSCRIPTS.com
4d3
< Hioxindia.com
2. Compare the files side by side, ignoring white space:
diff -by file1.txt file2.txt
This command will compare the files ignoring white/blank space; the side-by-side output
makes it easier to see the differences between the files.
About wc
Short for word count, wc displays a count of lines, words, and characters in a file.
Syntax
wc [-c | -m | -C ] [-l] [-w] [ file ... ]
-c Count bytes.
-m Count characters.
-C Same as -m.
-l Count lines.
-w Count words.
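A minimal sketch of wc on a two-line sample file:

```shell
dir=$(mktemp -d)
cd "$dir"
printf 'one two\nthree\n' > sample.txt

wc sample.txt      # prints lines, words and bytes: 2 3 14 sample.txt
wc -l sample.txt   # lines only
```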
About split
split breaks a file into pieces.
Syntax
split [ -a suffixlength ] [ -b n | -l line_count ] [ file [ name ] ]
-a suffixlength Use suffixlength letters to form the suffix portion of the filenames of the
split file. If -a is not specified, the default suffix length is 2. If the sum of the
name operand and the suffixlength option-argument would create a filename
exceeding NAME_MAX bytes, an error will result; split will exit with a
diagnostic message and no files will be created.
file The path name of the ordinary file to be split. If no input file is given or file is -,
the standard input will be used.
name The prefix to be used for each of the files resulting from the split operation. If
no name argument is given, x will be used as the prefix of the output files.
The combined length of the basename of prefix and suffixlength cannot
exceed NAME_MAX bytes; see OPTIONS.
Examples
split -b 22 newfile.txt new - would split the file "newfile.txt" into three separate files called
newaa, newab and newac, each (except possibly the last) 22 bytes in size.
split -l 300 file.txt new - would split the file "file.txt" into files beginning with the name
"new", each containing 300 lines of text.
About settime and touch
Change file access and modification time.
Syntax
touch [-a] [-c] [-m] [-r ref_file | -t time ] file
settime [ -f ref_file ] file
-a Change the access time of file. Do not change the modification time unless -
m is also specified.
-c Do not create a specified file if it does not exist. Do not write any diagnostic
messages concerning this condition.
-m Change the modification time of file. Do not change the access time unless -a
is also specified.
-r ref_file Use the corresponding times of the file named by ref_file instead of the
current time.
-t time Use the specified time instead of the current time. time will be a decimal
number of the form:
[[CC]YY]MMDDhhmm [.SS]
MM - The month of the year [01-12].
DD - The day of the month [01-31].
hh - The hour of the day [00-23].
mm - The minute of the hour [00-59].
CC - The first two digits of the year.
YY - The second two digits of the year.
SS - The second of the minute [00-61].
-f ref_file Use the corresponding times of the file named by ref_file instead of the
current time.
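A minimal sketch of touch (the file names are invented):

```shell
dir=$(mktemp -d)
cd "$dir"

touch report.txt                   # create an empty file, or update its times
touch -t 202001311200 report.txt   # -t: set both times to 31 Jan 2020, 12:00
touch -r report.txt copy.txt       # -r: give copy.txt report.txt's times
```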
File permissions
Linux uses the same permissions scheme as Unix. Each file and directory on your system is
assigned access rights for the owner of the file, the members of a group of related users, and
everybody else. Rights can be assigned to read a file, to write a file, and to execute a file (i.e.,
run the file as a program).
To see the permission settings for a file, we can use the ls command as follows:
Let's try another example. We will look at the bash program, which is located in the /bin
directory:
In the diagram below, we see how the first portion of the listing is interpreted. It consists of a
character indicating the file type, followed by three sets of three characters that convey the
reading, writing and execution permission for the owner, group, and everybody else.
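As a sketch of that interpretation, here is a listing decoded character by character (the file, its owner, size and date are invented for the example):

```shell
dir=$(mktemp -d)
cd "$dir"
printf '#!/bin/sh\necho hello\n' > some_file
chmod 755 some_file

ls -l some_file
# A typical line of output (owner, size and date will vary):
#   -rwxr-xr-x 1 user user 21 Jan  1 12:00 some_file
# Reading the first column:
#   -     file type: "-" regular file, "d" directory, "l" symbolic link
#   rwx   the owner may read, write and execute
#   r-x   group members may read and execute
#   r-x   everybody else may read and execute
```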
chmod
The chmod command is used to change the permissions of a file or directory. To use it, you
specify the desired permission settings and the file or files that you wish to modify. There are
two ways to specify the permissions, but I am only going to teach one way.
It is easy to think of the permission settings as a series of bits (which is how the computer
thinks about them). Here's how it works:
and so on...
Now, if you represent each of the three sets of permissions (owner, group, and other) as a
single digit, you have a pretty convenient way of expressing the possible permissions settings.
For example, if we wanted to set some_file to have read and write permission for the owner,
but wanted to keep the file private from others, we would:
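The command this paragraph is building toward would be the following (some_file is a placeholder name, created here so the sketch runs):

```shell
# some_file: a placeholder file for the demonstration
touch some_file

# 6 = 4 (read) + 2 (write) for the owner; 0 for group and others
chmod 600 some_file

# The long listing now begins with -rw-------
ls -l some_file
```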
CREC,Dept.Of MCA Page 23
Linux Programming
Here is a table of numbers that covers all the common settings. The ones beginning with "7"
are used with programs (since they enable execution) and the rest are for other kinds of files.
Value   Meaning
755     (rwxr-xr-x) The file's owner may read, write, and execute the file. All others
        may read and execute the file. This setting is common for programs that are
        used by all users.
700     (rwx------) The file's owner may read, write, and execute the file. Nobody else
        has any rights. This setting is useful for programs that only the owner may use
        and must be kept private from others.
666     (rw-rw-rw-) All users may read and write the file.
644     (rw-r--r--) The owner may read and write a file, while all others may only read
        the file. A common setting for data files that everybody may read, but only the
        owner may change.
600     (rw-------) The owner may read and write a file. All others have no rights. A
        common setting for data files that the owner wants to keep private.
Directory permissions
The chmod command can also be used to control the access permissions for directories. In
most ways, the permissions scheme for directories works the same way as they do with files.
However, the execution permission is used in a different way. It provides control for access to
file listing and other things. Here are some useful settings for directories:
Value   Meaning
777     (rwxrwxrwx) No restrictions on permissions. Anybody may list files, create new
        files in the directory, and delete files in the directory. Generally not a good
        setting.
755     (rwxr-xr-x) The directory owner has full access. All others may list the
        directory, but cannot create or delete files. This setting is common for
        directories that you wish to share with other users.
700     (rwx------) The directory owner has full access. Nobody else has any rights.
        This setting is useful for directories that only the owner may use and must be
        kept private from others.
It is often useful to become the superuser to perform important system administration tasks,
but as you have been warned (and not just by me!), you should not stay logged on as the
superuser. In most distributions, there is a program that can give you temporary access to the
superuser's privileges. This program is called su (short for substitute user) and can be used in
those cases when you need to be the superuser for a small number of tasks. To become the
superuser, simply type the su command. You will be prompted for the superuser's password:
[me@linuxbox me]$ su
Password:
[root@linuxbox me]#
After executing the su command, you have a new shell session as the superuser. To exit the
superuser session, type exit and you will return to your previous session.
In some distributions, most notably Ubuntu, an alternate method is used. Rather than using
su, these systems employ the sudo command instead. With sudo, one or more users are
granted superuser privileges on an as needed basis. To execute a command as the superuser,
the desired command is simply preceded with the sudo command. After the command is
entered, the user is prompted for their own password rather than the superuser's:
You can change the owner of a file by using the chown command. Here's an example: suppose
I wanted to change the owner of some_file from "me" to "you". I could:
[me@linuxbox me]$ su
Password:
[root@linuxbox me]# chown you some_file
[root@linuxbox me]# exit
[me@linuxbox me]$
Notice that in order to change the owner of a file, you must be the superuser. To do this, our
example employed the su command, then we executed chown, and finally we typed exit to
return to our previous session.
The group ownership of a file or directory may be changed with chgrp. This command is
used like this:
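The example the next sentence refers to is missing from these notes; reconstructed from the description, it would look like the sketch below. The text's group name "new_group" is replaced by the caller's own primary group so the sketch actually runs (you must own the file and belong to the target group):

```shell
# Placeholder file; in the text's example the group becomes "new_group"
touch some_file

# Change the file's group to the caller's own primary group
chgrp "$(id -gn)" some_file

# Confirm the group column of the long listing
ls -l some_file
```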
In the example above, we changed the group ownership of some_file from its previous
group to "new_group". You must be the owner of the file or directory to perform a chgrp.
chown Command:
chown command is used to change the owner / user of the file or directory. This is an
admin command, root user only can change the owner of a file or directory.
SYNTAX:
The Syntax is
chown [options] newowner filename/directoryname
OPTIONS:
-R   Recursively change the ownership of the directory and of all files in its
     subdirectories.
-c   Report each file whose ownership is actually changed.
-f   Prevents chown from displaying error messages when it is unable to change
     the ownership of a file.
EXAMPLE:
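A runnable sketch (file name is made up). Giving a file to another user requires root, but chown to your own user name is always permitted, so that is what the demonstration does; the recursive form is shown as a comment:

```shell
# Create a placeholder file
touch report.txt

# chown to your own user name always succeeds without root
chown "$(id -un)" report.txt

# As root, you could hand a whole tree to another user:
#   chown -R you project_dir
ls -l report.txt
```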
Permission 000
SYNTAX:
The Syntax is
chmod [options] [MODE] FileName
File Permission
# File Permission
0 none
1 execute only
2 write only
3 write and execute
4 read only
5 read and execute
6 read and write
7 set all permissions
OPTIONS:
-c Displays names of only those files whose permissions are being changed
-f Suppress most error messages
-R Change files and directories recursively
-v Output version information and exit.
EXAMPLE:
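A few typical invocations, using the octal modes from the tables above (file and directory names are placeholders):

```shell
# Placeholder files and a directory for the demonstration
touch script.sh notes.txt
mkdir -p private_dir

chmod 755 script.sh       # rwxr-xr-x: typical for programs
chmod -v 644 notes.txt    # rw-r--r--: typical for data files (-v reports the change)
chmod -R 700 private_dir  # rwx------ applied recursively to the directory tree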
chgrp Command:
chgrp command is used to change the group of the file or directory. This is an admin
command. Root user only can change the group of the file or directory.
SYNTAX:
The Syntax is
chgrp [options] newgroup filename/directoryname
OPTIONS:
-R   Recursively change the group of the directory and of all files in its
     subdirectories.
EXAMPLE:
PROCESS UTILITIES:
ps COMMAND:
ps command is used to report the process status. ps is the short name for Process Status.
SYNTAX:
The Syntax is
ps [options]
OPTIONS:
-a   List information about all processes most frequently requested: all those
     except process group leaders and processes not associated with a terminal.
-A   List information for all processes (same as -e).
-d   List information about all processes except session leaders.
-e   List information about every process now running.
-f   Generate a full listing.
-j   Print session ID and process group ID.
-l   Generate a long listing.
EXAMPLE:
1. ps
Output:
PID TTY TIME CMD
2540 pts/1 00:00:00 bash
2621 pts/1 00:00:00 ps
In the above example, typing ps alone would list the current running processes.
2. ps -f
Output:
UID PID PPID C STIME TTY TIME CMD
SYNTAX:
The Syntax is
kill [-s signal] [-l] pid
OPTIONS:
-s    Specify the signal to send. The signal may be given as a signal name or
      number.
-l    Write all signal values supported by the implementation, if no operand is
      given.
pid   Process ID or job ID.
-9    Force-kill a process (SIGKILL cannot be caught or ignored).
EXAMPLE:
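A small, self-contained sketch: a throwaway background job is started just so there is something safe to kill:

```shell
# Start a throwaway background job to practice on
sleep 100 &
pid=$!

# Send the default signal (SIGTERM); kill -9 "$pid" would force it
kill "$pid"

# -l lists the signal names the system supports
kill -l
```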
command The name of the command to be invoked.
atq lists the user's pending jobs, unless the user is the superuser; in that case, everybody's
jobs are listed. The format of the output lines (one for each job) is: Job number, date,
hour, job class.
batch executes commands when system load levels permit; in other words, when the load
average drops below 1.5, or the value specified in the invocation of atrun.
at [-c | -k | -s] [-f filename] [-q queuename] [-m] -t time [date] [-l] [-r]
-c C shell. csh(1) is used to execute the at-job.
-t time Specifies at what time you want the command to be run. Format hh:mm. An
am / pm indication can also follow the time; otherwise a 24-hour clock is used. A
timezone name of GMT, UCT or ZULU (case insensitive) can follow to
specify that the time is in Coordinated Universal Time. Other timezones can
be specified using the TZ environment variable. The below quick times can
also be entered:
date Specifies the date you wish it to be run on. Format: month, date, year. The
following quick days can also be entered:
FILTERS:
more COMMAND:
more command is used to display text in the terminal screen, one screenful at a time. It
allows only forward movement.
SYNTAX:
The Syntax is
more [options] filename
OPTIONS:
EXAMPLE:
1. more -c index.php
Clears the screen before printing the file .
2. more -3 index.php
Prints first three lines of the given file. Press Enter to display the file line by line.
head COMMAND:
head command is used to display the first ten lines of a file; you can also specify how many
lines to display.
SYNTAX:
The Syntax is
head [options] filename
OPTIONS:
1. head index.php
This command prints the first 10 lines of 'index.php'.
2. head -5 index.php
The head command displays the first 5 lines of 'index.php'.
3. head -c 5 index.php
The above command displays the first 5 characters of 'index.php'.
tail COMMAND:
tail command is used to display the last or bottom part of the file. By default it displays the
last 10 lines of a file.
SYNTAX:
The Syntax is
tail [options] filename
OPTIONS:
1. tail index.php
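A few more tail invocations (index.php stands in for any file; a 30-line sample is generated here so the sketch runs):

```shell
# Build a 30-line sample file to work on
seq 1 30 > index.php

tail -n 5 index.php    # last 5 lines instead of the default 10
tail -n +20 index.php  # everything from line 20 onward
# tail -f logfile      # follow a growing file (Ctrl-C to stop)
```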
SYNTAX:
The Syntax is
cut [options]
OPTIONS:
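The notes give no cut example, so here is a minimal sketch. The sample file is made up; its colon-separated layout mimics /etc/passwd:

```shell
# A small passwd-style sample file (made up for the sketch)
printf 'root:x:0:0\ndaemon:x:1:1\n' > sample.txt

# -d sets the field delimiter, -f picks the fields
cut -d: -f1 sample.txt   # first field of each line: the names

# -c selects character positions instead of fields
cut -c1-4 sample.txt     # first four characters of each line
```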
SYNTAX:
The Syntax is
paste [options]
OPTIONS:
EXAMPLE:
1. paste test.txt > test1.txt
Paste the content from 'test.txt' file to 'test1.txt' file.
2. ls | paste - - - -
List all files and directories in four columns for each line.
sort COMMAND:
sort command is used to sort the lines in a text file.
SYNTAX:
The Syntax is
sort [options] filename
OPTIONS:
1. sort test.txt
Sorts the 'test.txt'file and prints result in the screen.
2. sort -r test.txt
Sorts the 'test.txt' file in reverse order and prints result in the screen.
About uniq
Report or filter out repeated lines in a file.
Syntax
uniq [-c | -d | -u ] [ -f fields ] [ -s char ] [-n] [+m] [input_file [ output_file ] ]
-c Precede each output line with a count of the number of times the line
occurred in the input.
-d Suppress the writing of lines that are not repeated in the input.
-f fields Ignore the first fields fields on each input line when doing comparisons,
where fields is a positive decimal integer. A field is the maximal string
matched by the basic regular expression:
[[:blank:]]*[^[:blank:]]*
If fields specifies more fields than appear on an input line, a null string will
be used for comparison.
-s chars Ignore the first chars characters when doing comparisons, where chars is a
positive decimal integer. If specified in conjunction with the -f option, the
first chars characters after the first fields fields will be ignored. If chars
specifies more characters than remain on an input line, a null string will be
used for comparison.
input_file A path name of the input file. If input_file is not specified, or if the input_file
is -, the
standard input will be used.
output_file A path name of the output file. If output_file is not specified, the standard
output will be used. The results are unspecified if the file named by
output_file is the file named by input_file.
Examples
uniq myfile1.txt > myfile2.txt - Removes duplicate adjacent lines in myfile1.txt and writes
the results to myfile2.txt.
About tr
Translate characters.
Syntax
tr [-c] [-d] [-s] [string1] [string2]
-c Complement the set of characters specified by string1.
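Three quick tr examples illustrating translation, deletion, and squeezing (the input strings are made up):

```shell
echo "hello"  | tr 'a-z' 'A-Z'   # translate lowercase to uppercase: HELLO
echo "ph0n3"  | tr -d '0-9'      # -d deletes the characters in string1: phn
echo "aaabbb" | tr -s 'ab'       # -s squeezes runs of repeats to one: ab
```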
SYNTAX:
The Syntax is
date [options] [+format] [date]
OPTIONS:
Format:
%a Abbreviated weekday(Tue).
%A Full weekday(Tuesday).
%b Abbreviated month name(Jan).
%B Full month name(January).
%c Locale-specific date and time format.
%D Date in the format %m/%d/%y.
%j Julian day of year (001-366).
%n Insert a new line.
%p String to indicate a.m. or p.m.
%T Time in the format %H:%M:%S.
%t Tab space.
%V Week number in year (01-53); start week on Monday.
EXAMPLE:
1. date command
date
The above command will print Wed Jul 23 10:52:34 IST 2008
2. To use tab space:
date +"Date is %D %t Time is %T"
The above command will print
Date is 07/23/08 Time is 10:52:34
3. To know the week number of the year,
date +%V
The above command will print 30
4. To set the date,
date -s "10/08/2008 11:37:23"
The above command will print Wed Oct 08 11:37:23 IST 2008
who COMMAND:
who command can list the names of users currently logged in, their terminal, the time they
have been logged in, and the name of the host from which they have logged in.
SYNTAX:
The Syntax is
who [options] [file]
OPTIONS:
am i     Print the username of the invoking user. The 'am' and 'i' must be space
         separated.
-b       Print time of last system boot.
-d       Print dead processes.
-H       Print column headings above the output.
-i       Include idle time as HOURS:MINUTES. An idle time of . indicates activity
         within the last minute.
-m       Same as who am i.
-q       Print only the usernames and the total count of users logged in.
-T, -w   Include each user's message status in the output.
EXAMPLE:
1. who -uH
Output:
NAME LINE TIME IDLE PID COMMENT
hiox ttyp3 Jul 10 11:08 . 4578
This sample output was produced at 11 a.m. The "." indicates activity within the last
minute.
2. who am i
who am i command prints the user name.
echo COMMAND:
echo command prints the given input string to standard output.
SYNTAX:
The Syntax is
echo [options..] [string]
OPTIONS:
EXAMPLE:
1. echo command
echo "hscripts Hiox India"
The above command will print as hscripts Hiox India
2. To use backspace:
echo -e "hscripts \bHiox \bIndia"
The above command will remove space and print as hscriptsHioxIndia
3. To use tab space in echo command
echo -e "hscripts\tHiox\tIndia"
The above command will print as hscripts Hiox India
passwd COMMAND:
passwd command is used to change your password.
SYNTAX:
The Syntax is
passwd [options]
OPTIONS:
EXAMPLE:
1. passwd
Entering just passwd would allow you to change the password. After entering passwd
you will receive the following three prompts:
Current Password:
New Password:
Confirm New Password:
Each of these prompts must be entered correctly for the password to be successfully
changed.
pwd COMMAND:
pwd - Print Working Directory. pwd command prints the full filename of the current
working directory.
SYNTAX:
The Syntax is
pwd [options]
OPTIONS:
SYNTAX:
The Syntax is
cal [options] [month] [year]
OPTIONS:
EXAMPLE:
1. cal
Output:
September 2008
Su Mo Tu We Th Fr Sa
1 2 3 4 5 6
7 8 9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30
cal command displays the current month calendar.
2. cal -3 5 2008
Output:
April 2008 May 2008 June 2008
Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa Su Mo Tu We Th Fr Sa
1 2 3 4 5 1 2 3 1 2 3 4 5 6 7
6 7 8 9 10 11 12 4 5 6 7 8 9 10 8 9 10 11 12 13 14
13 14 15 16 17 18 19 11 12 13 14 15 16 17 15 16 17 18 19 20 21
20 21 22 23 24 25 26 18 19 20 21 22 23 24 22 23 24 25 26 27 28
27 28 29 30 25 26 27 28 29 30 31 29 30
Here the cal command displays the calendar of April, May and June month of year
2008.
login Command
Signs a user in to the system.
Syntax
login [ -p ] [ -d device ] [-h hostname | terminal | -r hostname ] [ name [ environ ] ]
-p Used to pass environment variables to the login shell.
-d device login accepts a device option, device. device is taken to be the path name of
the TTY port login is to operate on. The use of the device option can be
expected to improve login performance, since login will not need to call
ttyname. The -d option is available only to users whose UID and effective
UID are root. Any other attempt to use -d will cause login to quietly exit.
-h hostname | Used by in.telnetd to pass information about the remote host and terminal
terminal type.
uname [-a] [-i] [-m] [-n] [-p] [-r] [-s] [-v] [-X] [-S systemname]
-a Print basic information currently available from the system.
-m Print the machine hardware name (class). Use of this option is discouraged;
use uname -p instead.
-n Print the nodename (the nodename is the name by which the system is known
to a communications network).
Disk utilities
df is used to report the number of disk blocks and inodes used and free for each file system.
The output format and valid options are very specific to the OS and program version in use.
Syntax
df [options] [resource]
Common Options
du reports the amount of disk space in use for the files or directories you specify.
Syntax
du [options] [directory or file]
Common Options
du
1 ./.elm
1 ./Mail
1 ./News
20 ./uc
du -a uc
7 uc/unixgrep.txt
5 uc/editors.txt
1 uc/.emacs
1 uc/.exrc
4 uc/telnet.ftp
1 uc/uniq.tee.txt
20 uc
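A runnable du sketch (the directory and file names are made up; a small tree is created so there is something to measure):

```shell
# Build a small directory to measure
mkdir -p proj
echo "hello" > proj/a.txt

du -sk proj   # -s: one summary line for the whole tree; -k: kilobytes
du -a proj    # -a: report individual files as well as directories
```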
NETWORKING COMMANDS
TELNET and FTP are application-level Internet protocols. The TELNET and FTP protocol
specifications have been implemented by many different sources, including the National
Center for Supercomputing Applications (NCSA) and many other public domain and
shareware sources. rlogin is a remote login service that was at one time exclusive to Berkeley.
Essentially, it offers the same functionality as telnet, except that it passes to the remote
computer information about the user's login environment. Machines can be configured to
allow connections from trusted hosts without prompting for the users' passwords. A more
secure version of this protocol is the Secure Shell, SSH, software written by Tatu Ylonen
and available via ftp://ftp.net.ohio-state.edu/pub/security/ssh.
The r-utilities rsh (remote shell), rcp (remote copy), and rlogin (remote login) were
prevalent in the past, but because they offer little security, they are generally discouraged in
today's environments. rsh and rlogin are similar in functionality to telnet, and rcp is similar
to ftp.
Common options (ftp):
-d   turn on debugging (SVR4 only)
-i   turn off interactive prompting
Example telnet invocations:
telnet solaris
or
telnet 192.168.1
finger displays the .plan file of a specific user, or reports who is logged into a specific
machine. The user must allow general read permission on the .plan file.
Syntax
Common Options
Examples
Remote login
where the parts in brackets ([]) are optional. rcp does not prompt for passwords, so you must
have permission to execute remote commands on the specified machines as the selected user on
each machine.
Common Options
-l username connect as the user, username, on the remote host (rlogin & rsh)
Using ssh
ssh (Secure Shell) and telnet are two methods that enable you to log in to a remote system and
run commands interactively; ssh encrypts the whole session, while telnet sends everything,
including passwords, in clear text. The general form is:
command hostname
ssh darwin
or
ssh 192.168.1.58
The ping Utility
The ping command sends an echo request to a host available on the network. Using this
command, you can check whether your remote host is responding or not.
The ping command is useful for the following −
• Tracking and isolating hardware and software problems.
• Determining the status of the network and various foreign hosts.
• Testing, measuring, and managing networks.
Syntax
Following is the simple syntax to use ping command −
$ping hostname or ip-address
The above command starts printing a response every second. To come out of the
command, terminate it by pressing CTRL+C.
Example
Following is the example to check the availability of a host available on the network −
$ping google.com
PING google.com (74.125.67.100) 56(84) bytes of data.
64 bytes from 74.125.67.100: icmp_seq=1 ttl=54 time=39.4 ms
64 bytes from 74.125.67.100: icmp_seq=2 ttl=54 time=39.9 ms
64 bytes from 74.125.67.100: icmp_seq=3 ttl=54 time=39.3 ms
64 bytes from 74.125.67.100: icmp_seq=4 ttl=54 time=39.1 ms
64 bytes from 74.125.67.100: icmp_seq=5 ttl=54 time=38.8 ms
--- google.com ping statistics ---
22 packets transmitted, 22 received, 0% packet loss, time 21017ms
rtt min/avg/max/mdev = 38.867/39.334/39.900/0.396 ms
$
If a host does not exist then it would behave something like this −
$ping giiiiigle.com
ping: unknown host giiiiigle.com
$
The ftp Utility
Here ftp stands for File Transfer Protocol. This utility helps you upload files to and download
files from another computer.
The ftp utility has its own set of Unix-like commands, which allow you to perform tasks such
as −
• Connect and log in to a remote host.
• Navigate directories.
• List directory contents.
• Put and get files.
• Transfer files as ascii, ebcdic, or binary.
Syntax
Following is the simple syntax to use the ftp command −
$ftp hostname or ip-address
The above command prompts you for a login ID and password. Once you are authenticated,
you have access to the home directory of the login account and can perform various
commands.
Few of the useful commands are listed below −
Command          Description
put filename     Upload filename from the local machine to the remote machine.
get filename     Download filename from the remote machine to the local machine.
mput file list   Upload more than one file from the local machine to the remote machine.
mget file list   Download more than one file from the remote machine to the local machine.
prompt off       Turns the prompt off; by default you are prompted before transferring
                 each file with the mput or mget commands.
prompt on        Turns the prompt on.
dir              List all the files available in the current directory of the remote machine.
cd dirname       Change directory to dirname on the remote machine.
lcd dirname      Change directory to dirname on the local machine.
quit             Log out from the current session.
It should be noted that all files are downloaded or uploaded to or from the current
directories. If you want to upload your files to a particular directory, first change to that
directory and then upload the required files.
Example
Following is the example to show few commands −
$ftp amrood.com
Connected to amrood.com.
220 amrood.com FTP server (Ver 4.9 Thu Sep 2 20:35:07 CDT 2009)
Name (amrood.com:amrood): amrood
331 Password required for amrood.
Password:
230 User amrood logged in.
ftp> dir
200 PORT command successful.
150 Opening data connection for /bin/ls.
total 1464
drwxr-sr-x 3 amrood group 1024 Mar 11 20:04 Mail
drwxr-sr-x 2 amrood group 1536 Mar 3 18:07 Misc
drwxr-sr-x 5 amrood group 512 Dec 7 10:59 OldStuff
drwxr-sr-x 2 amrood group 1024 Mar 11 15:24 bin
drwxr-sr-x 5 amrood group 3072 Mar 13 16:10 mpl
-rw-r--r-- 1 amrood group 209671 Mar 15 10:57 myfile.out
drwxr-sr-x 3 amrood group 512 Jan 5 13:32 public
drwxr-sr-x 3 amrood group 512 Feb 10 10:17 pvm3
226 Transfer complete.
ftp> cd mpl
250 CWD command successful.
ftp> dir
200 PORT command successful.
150 Opening data connection for /bin/ls.
total 7320
-rw-r--r-- 1 amrood group 1630 Aug 8 1994 dboard.f
-rw-r----- 1 amrood group 4340 Jul 17 1994 vttest.c
-rwxr-xr-x 1 amrood group 525574 Feb 15 11:52 wave_shift
-rw-r--r-- 1 amrood group 1648 Aug 5 1994 wide.list
-rwxr-xr-x 1 amrood group 4019 Feb 14 16:26 fix.c
226 Transfer complete.
ftp> get wave_shift
200 PORT command successful.
150 Opening data connection for wave_shift (525574 bytes).
226 Transfer complete.
528454 bytes received in 1.296 seconds (398.1 Kbytes/s)
ftp> quit
221 Goodbye.
$
The telnet Utility
The telnet command logs you in to a remote machine (here, telnet amrood.com):
login: amrood
amrood's Password:
*****************************************************
* *
* *
* WELCOME TO AMROOD.COM *
* *
* *
*****************************************************
{ do your work }
$ logout
Connection closed.
C:>
The finger Utility
The finger command displays information about users on a given host. The host can be either
local or remote.
Finger may be disabled on other systems for security reasons.
Following are the simple syntax to use finger command −
Check all the logged in users on local machine as follows −
$ finger
Login Name Tty Idle Login Time Office
amrood pts/0 Jun 25 08:03 (62.61.164.115)
Get information about a specific user available on local machine −
$ finger amrood
Computers are connected in a network to exchange information or resources with each other.
Two or more computers connected through network media form a computer network; a number
of network devices or media are involved in forming one. A computer loaded with the Linux
operating system can also be part of a network, whether small or large, thanks to its
multitasking and multiuser nature. Keeping the system and network up and running is part of a
system/network administrator's job. In this article we review frequently used network
configuration and troubleshooting commands in Linux.
Linux Network Configuration Commands
sort
File sort utility, often used as a filter in a pipe. This command sorts a text stream or file
forwards or backwards, or according to various keys or character positions. Using the -m
option, it merges presorted input files. The info page lists its many capabilities and
options.
tsort
Topological sort, reading in pairs of whitespace-separated strings and sorting according
to input patterns. The results of a tsort will usually differ markedly from those of the
standard sort command, above.
uniq
This filter removes duplicate lines from a sorted file. It is often seen in a pipe coupled
with sort.
The useful -c option prefixes each line of the input file with its number of occurrences.
The sort INPUTFILE | uniq -c | sort -nr command string produces a frequency of
occurrence listing on the INPUTFILE file (the -nr options to sort cause a reverse
numerical sort). This template finds use in analysis of log files and dictionary lists, and
wherever the lexical structure of a document needs to be examined.
bash$ cat testfile
This line occurs only once.
This line occurs twice.
This line occurs twice.
This line occurs three times.
This line occurs three times.
This line occurs three times.
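Putting the pipeline from the paragraph above to work on this testfile (the file is rebuilt here so the sketch is self-contained):

```shell
# Rebuild the testfile shown above
printf '%s\n' \
  'This line occurs only once.' \
  'This line occurs twice.' 'This line occurs twice.' \
  'This line occurs three times.' 'This line occurs three times.' \
  'This line occurs three times.' > testfile

# Frequency-of-occurrence listing, most frequent first
sort testfile | uniq -c | sort -nr
```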
expand, unexpand
The expand filter converts tabs to spaces. The unexpand filter converts spaces to tabs,
reversing the effect of expand.
cut
A tool for extracting fields from files. It is similar to the print $N command set in awk,
but more limited. It may be simpler to use cut in a script than awk. Particularly important
are the -d (delimiter) and -f (field specifier) options.
cut -d ' ' -f2,3 filename is equivalent to awk -F'[ ]' '{ print $2, $3 }' filename
paste
Tool for merging together different files into a single, multi-column file. In combination
with cut, useful for creating system log files.
join
Consider this a special-purpose cousin of paste. This powerful utility allows merging two
files in a meaningful fashion, which essentially creates a simple version of a relational
database.
The join command operates on exactly two files, but pastes together only those lines with
a common tagged field (usually a numerical label), and writes the result to stdout. The
files to be joined should be sorted according to the tagged field for the matchups to work
properly.
File: 1.data
100 Shoes
200 Laces
300 Socks
File: 2.data
100 $40.00
200 $1.00
300 $2.00
bash$ join 1.data 2.data
File: 1.data 2.data
100 Shoes $40.00
200 Laces $1.00
300 Socks $2.00
head
lists the beginning of a file to stdout. The default is 10 lines, but a different number can
be specified. The command has a number of interesting options.
tail
lists the (tail) end of a file to stdout. The default is 10 lines, but this can be changed with
the -n option. Commonly used to keep track of changes to a system logfile, using the -f
option, which outputs lines appended to the file.
To list a specific line of a text file, pipe the output of head to tail -n 1. For example head -n 8
database.txt | tail -n 1 lists the 8th line of the file database.txt.
Newer implementations of tail deprecate the older tail -$LINES filename usage. The standard
tail -n $LINES filename is correct.
grep
A multi-purpose file search tool that uses Regular Expressions. It was originally a
command/filter in the venerable ed line editor: g/re/p -- global - regular expression - print.
Search the target file(s) for occurrences of pattern, where pattern may be literal text or a Regular
Expression.
The -c (--count) option gives a numerical count of matches, rather than actually listing
the matches.
# grep -cz .
# ^ dot
# means count (-c) zero-separated (-z) items matching "."
# that is, non-empty ones (containing at least 1 character).
#
printf 'a b\nc d\n\n\n\n\n\000\n\000e\000\000\nf' | grep -cz . # 3
printf 'a b\nc d\n\n\n\n\n\000\n\000e\000\000\nf' | grep -cz '$' # 5
printf 'a b\nc d\n\n\n\n\n\000\n\000e\000\000\nf' | grep -cz '^' # 5
#
# Thanks, S.C.
The --color (or --colour) option marks the matching string in color (on the console or in
an xterm window). Since grep prints out each entire line containing the matching pattern,
this lets you see exactly what is being matched. See also the -o option, which shows only
the matching portion of the line(s).
When invoked with more than one target file, grep specifies which file contains
each match.
To force grep to show the filename when searching only one target file, simply give /dev/null as
the second file.
How can grep search for two (or more) separate patterns? What if you want grep to display all
lines in a file or files that contain both "pattern1" and "pattern2"?
# Filename: tstfile
Now, let's search this file for lines containing both "file" and "text" . . .
egrep -- extended grep -- is the same as grep -E. This uses a somewhat different, extended set of
Regular Expressions, which can make the search a bit more flexible. It also allows the boolean |
(or) operator.
fgrep -- fast grep -- is the same as grep -F. It does a literal string search (no Regular
Expressions), which generally speeds things up a bit.
On some Linux distros, egrep and fgrep are symbolic links to, or aliases for grep, but invoked
with the -E and -F options, respectively.
agrep (approximate grep) extends the capabilities of grep to approximate matching. The search
string may differ by a specified number of characters from the resulting matches. This utility is
not part of the core Linux distribution.
To search compressed files, use zgrep, zegrep, or zfgrep. These also work on
non-compressed files, though slower than plain grep, egrep, fgrep. They are
handy for searching through a mixed set of files, some compressed, some not.
look
The command look works like grep, but does a lookup on a "dictionary," a sorted word
list. By default, look searches for a match in /usr/dict/words, but a different dictionary
file may be specified.
sed, awk
Scripting languages especially suited for parsing text files and command output. May be
embedded singly or in combination in pipes and shell scripts.
sed
awk
Programmable file extractor and formatter, good for manipulating and/or extracting fields
(columns) in structured text files. Its syntax is similar to C.
wc
bash $ wc /usr/share/doc/sed-4.1.2/README
13 70 447 README
[13 lines 70 words 447 characters]
wc -w gives only the word count.
wc -l gives only the line count.
wc -c gives only the byte count.
wc -m gives only the character count.
wc -L gives only the length of the longest line.
Using wc to count how many .txt files are in current working directory:
$ ls *.txt | wc -l
# Will work as long as none of the "*.txt" files
#+ have a linefeed embedded in their name.
# Thanks, S.C.
Using wc to total up the size of all the files whose names begin with letters in the range d-h.
Using wc to count the instances of the word "Linux" in the main source file for this book.
Tr command
character translation filter
Must use quoting and/or brackets, as appropriate. Quotes prevent the shell from reinterpreting
the special characters in tr command sequences. Brackets should be quoted to prevent expansion
by the shell.
Either tr "A-Z" "*" <filename or tr A-Z \* <filename changes all the uppercase letters in
filename to asterisks (writes to stdout). On some systems this may not work, but tr A-Z '[**]'
will.
tr -d 0-9 <filename
# Deletes all digits from the file "filename".
The --squeeze-repeats (or -s) option deletes all but the first instance of a string of consecutive
characters. This option is useful for removing excess whitespace.
bash$ echo "XXXXX" | tr --squeeze-repeats 'X'
X
The -c «complement» option inverts the character set to match. With this option, tr acts only
upon those characters not matching the specified set.
bash$ echo "acfdeb123" | tr -c b-d +
+c+d+b++++
Note that tr recognizes POSIX character classes.
bash$ echo "abcd2ef1" | tr '[:alpha:]' -
----2--1
tr variants
The tr utility has two historic variants. The BSD version does not use brackets (tr a-z A-
Z), but the SysV one does (tr '[a-z]' '[A-Z]'). The GNU version of tr resembles the BSD
one.
fold
A filter that wraps lines of input to a specified width. This is especially useful with the -s
option, which breaks lines at word spaces
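As a quick sketch (the sample sentence is arbitrary), fold -s wraps at word boundaries:

```shell
# Wrap input to at most 20 columns, breaking at spaces rather than mid-word
echo "the quick brown fox jumps over the lazy dog" | fold -s -w 20
```

Without -s, fold would cut at exactly the 20th column, even in the middle of a word.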
fmt
Simple-minded file formatter, used as a filter in a pipe to "wrap" long lines of text
output.
col
This deceptively named filter removes reverse line feeds from an input stream. It also
attempts to replace whitespace with equivalent tabs. The chief use of col is in filtering the
output from certain text processing utilities, such as groff and tbl.
column
Column formatter. This filter transforms list-type text output into a "pretty-printed" table
by inserting tabs at appropriate places.
Using column to format a directory listing
colrm
Column removal filter. This removes columns (characters) from a file and writes the file,
lacking the range of specified columns, back to stdout. colrm 2 4 <filename removes the
second through fourth characters from each line of the text file filename.
If the file contains tabs or nonprintable characters, this may cause unpredictable behavior.
In such cases, consider using expand and unexpand in a pipe preceding colrm.
nl
Line numbering filter: nl filename lists filename to stdout, but inserts consecutive
numbers at the beginning of each non-blank line. If filename is omitted, it operates on stdin.
The output of nl is very similar to cat -b, since, by default, nl does not number blank lines.
pr
Print formatting filter. This will paginate files (or stdout) into sections suitable for hard
copy printing or viewing on screen. Various options permit row and column
manipulation, joining lines, setting margins, numbering lines, adding page headers, and
merging files, among other things. The pr command combines much of the functionality
of nl, paste, fold, column, and expand.
pr -o 5 --width=65 fileZZZ | more gives a nice paginated listing to screen of fileZZZ
with margins set at 5 and 65.
A particularly useful option is -d, forcing double-spacing (same effect as sed G).
SED:
What is sed?
A non-interactive stream editor
Interprets sed instructions and performs actions
Use sed to:
Automatically perform edits on file(s)
Simplify doing the same edits on multiple files
Write conversion programs
SED ADDRESSES
Single-line address
Set-of-lines address
Range address
Nested address
SINGLE-LINE ADDRESS
Specifies only one line in the input file
special: dollar sign ($) denotes last line of input file
Examples:
show only line 3
sed -n -e '3 p' input-file
show only last line
sed -n -e '$ p' input-file
substitute "endif" with "fi" on line 10
sed -e '10 s/endif/fi/' input-file
SET-OF-LINES ADDRESS
use regular expression to match lines
written between two slashes
process only lines that match
may match several lines
lines may or may not be consecutive
Examples:
sed -e '/key/ s/more/other/' input-file
sed -n -e '/r..t/ p' input-file
RANGE ADDRESS
Defines a set of consecutive lines
Format:
start-addr,end-addr (inclusive)
Examples:
10,50 line-number,line-number
10,/R.E/ line-number,/RegExp/
/R.E./,10 /RegExp/,line-number
/R.E./,/R.E/ /RegExp/,/RegExp/
Example: Range Address
% sed -n -e '/^BEGIN$/,/^END$/p' input-file
Print lines between BEGIN and END, inclusive
BEGIN
Line 1 of input
Line 2 of input
Line3 of input
END
Line 4 of input
Line 5 of input
Nested Address
Nested address contained within another address
Example:
print blank lines between line 20 and 30
20,30{
/^$/ p
}
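The nested-address block above can be run as a one-liner; the 40-line sample file is generated here just for illustration:

```shell
# Build a sample file: every 5th line is blank
for i in $(seq 1 40); do
  if [ $((i % 5)) -eq 0 ]; then echo ""; else echo "line $i"; fi
done > input-file

# Count the blank lines that fall between lines 20 and 30 (inclusive);
# blanks sit at lines 20, 25 and 30, so this prints 3
sed -n '20,30{/^$/p;}' input-file | wc -l
```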
Address with !
address with an exclamation point (!):
instruction will be applied to all lines that do not match the address
Example:
print lines that do not contain "obsolete"
sed -n -e '/obsolete/!p' input-file
SED COMMANDS
Line Number
line number command (=) writes the current line number before each matched/output line
Examples:
sed -e '/Two-thirds-time/=' tuition.data
sed -e '/^[0-9][0-9]/=' inventory
modify commands
Insert Command: i
adds one or more lines directly to the output before the address:
The octal dump command (od -c) can be used to produce a similar result.
Hold Space
temporary storage area
used to save the contents of the pattern space
4 commands that can be used to move text back and forth between the pattern space and
the hold space:
h, H
g, G
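A classic illustration of the hold space is this GNU sed one-liner, which reverses the input by shuttling lines between the two buffers with G and h:

```shell
# 1!G  on every line but the first, append the hold space to the pattern space
# h    then copy the pattern space into the hold space
# $p   on the last line, print the accumulated (reversed) text
printf 'one\ntwo\nthree\n' | sed -n '1!G;h;$p'
# prints: three, two, one (one per line)
```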
File commands
allow reading from and writing to files while processing standard input
read: r command
write: w command
Read File command
Syntax: r filename
queue the contents of filename to be read and inserted into the output stream at
the end of the current cycle, or when the next input line is read
if filename cannot be read, it is treated as if it were an empty file, without
any error indication
single address only
Write File command
Syntax: w filename
Write the pattern space to filename
The filename will be created (or truncated) before the first input line is read
all w commands which refer to the same filename are output through the same
FILE stream
Branch Command (b)
Change the regular flow of the commands in the script file
Syntax: [addr1][,addr2]b[label]
Branch (unconditionally) to 'label' or to the end of the script
If "label" is supplied, execution resumes at the line following :label; otherwise,
control passes to the end of the script
Branch label
:mylabel
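A common use of a label with b is the GNU sed one-liner that joins all input lines into one:

```shell
# :a        define label "a"
# N         append the next input line to the pattern space
# $!ba      if this is not the last line, branch back to :a
# s/\n/ /g  finally replace the embedded newlines with spaces
printf 'one\ntwo\nthree\n' | sed ':a;N;$!ba;s/\n/ /g'
# prints: one two three
```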
Example: The quit (q) Command
Syntax: [addr]q
Quit (exit sed) when addr is encountered.
Example: Display the first 50 lines and quit
% sed -e '50q' datafile
Same as: % head -n 50 datafile
AWK
WHAT IS AWK?
created by: Aho, Weinberger, and Kernighan
scripting language used for manipulating data and generating reports
versions of awk
awk, nawk, mawk, pgawk, …
GNU awk: gawk
What can you do with awk?
awk operation:
scans a file line by line
splits each input line into fields
compares input line/fields to pattern
performs action(s) on matched lines
Useful for:
transform data files
produce formatted reports
Programming constructs:
format output lines
arithmetic and string operations
conditionals and loops
The Command:
awk
Buffers
AWK SCRIPTS
awk scripts are divided into three major parts:
Categories of Patterns
Tom Jones:4424:5/12/66:543354
Mary Adams:5346:11/4/63:28765
Sally Chang:1654:7/22/54:650000
Billy Black:1683:9/23/44:336500
% awk -F: '/00$/' employees2
Sally Chang:1654:7/22/54:650000
Billy Black:1683:9/23/44:336500
Example: explicit match
% cat datafile
northwest NW Charles Main 3.0 .98 3 34
western WE Sharon Gray 5.3 .97 5 23
southwest SW Lewis Dalsass 2.7 .8 2 18
southern SO Suan Chin 5.1 .95 4 15
southeast SE Patricia Hemenway 4.0 .7 4 17
eastern EA TB Savage 4.4 .84 5 20
northeast NE AM Main 5.1 .94 3 13
north NO Margot Weber 4.5 .89 5 9
central CT Ann Stephens 5.7 .94 5 13
% awk '$5 ~ /\.[7-9]+/' datafile
southwest SW Lewis Dalsass 2.7 .8 2 18
central CT Ann Stephens 5.7 .94 5 13
Examples: matching with REs
% awk '$2 !~ /E/{print $1, $2}' datafile
northwest NW
southwest SW
southern SO
north NO
central CT
% awk '/^[ns]/{print $1}' datafile
northwest
southwest
southern
southeast
northeast
north
ARITHMETIC OPERATORS
Operator Meaning Example
+ Add x+y
- Subtract x–y
* Multiply x*y
/ Divide x/y
% Modulus x%y
^ Exponentiation x^y
Example:
% awk '$3 * $4 > 500 {print $0}' file
Relational Operators
Operator Meaning Example
< Less than x<y
<= Less than or equal x<=y
== Equal to x == y
!= Not equal to x != y
> Greater than x>y
>= Greater than or equal to x>=y
~ Matched by reg exp x ~ /y/
!~ Not matched by reg exp x !~ /y/
Logical Operators
Operator Meaning Example
&& Logical AND a && b
|| Logical OR a || b
! NOT !a
Examples:
% awk '($2 > 5) && ($2 <= 15) {print $0}' file
% awk '$3 == 100 || $4 > 50' file
RANGE PATTERNS
Matches ranges of consecutive input lines
Syntax:
pattern1 , pattern2 {action}
pattern can be any simple pattern
pattern1 turns action on
pattern2 turns action off
Range Pattern Example
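A minimal self-contained sketch (the marker lines are arbitrary):

```shell
# Print every line from the first BEGIN marker through the next END marker
printf 'x\nBEGIN\none\ntwo\nEND\ny\n' | awk '/^BEGIN$/,/^END$/ {print}'
# prints: BEGIN, one, two, END (one per line)
```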
AWK ACTIONS
AWK EXPRESSIONS
Expression is evaluated and returns value
consists of any combination of numeric and string constants, variables, operators,
functions, and regular expressions
Can involve variables
As part of expression evaluation
As target of assignment
awk variables
jasper 84
john 85
% awk '{print $1,$2 | "sort –k 2"}' grades
jasper 84
john 85
andrea 89
% date
Wed Nov 19 14:40:07 CST 2008
% date |
awk '{print "Month: " $2 "\nYear: ", $6}'
Month: Nov
Year: 2008
printf: Formatting output
Syntax:
printf(format-string, var1, var2, …)
works like C printf
each format specifier in “format-string” requires argument of matching type
Format specifiers
%d, %i decimal integer
%c single character
%s string of characters
%f floating point number
%o octal number
%x hexadecimal number
%e scientific floating point notation
%% the letter “%”
Format specifier examples
Given: x = 'A', y = 15, z = 2.3, and $1 = "Bob Smith"
print "======================================"
}
{
printf("%3d\t%-20s\t%6.2f\n", $1, $2, $3)
count++
}
END {
print "======================================"
print "Catalog has " count " parts"
}
awk Array
awk allows one-dimensional arrays
to store strings or numbers
index can be number or string
array need not be declared
its size
its elements
array elements are created when first used
initialized to 0 or “”
Arrays in awk
Syntax:
arrayName[index] = value
Examples:
list[1] = "one"
list[2] = "three"
list["other"] = "oh my !"
Illustration: Associative Arrays
awk arrays can use string as index
output:
summary of category sales
Illustration: process each input line
deptSales[$2] += $3
}
END {
for (x in deptSales)
print x, deptSales[x]
}
% awk -f sales.awk sales
Awk control structures
Conditional
if-else
Repetition
for
with counter
with array index
while
do-while
also: break, continue
if Statement
Syntax:
if (conditional expression)
statement-1
else
statement-2
Example:
if ( NR < 3 )
print $2
else
print $3
for Loop
Syntax:
for (initialization; limit-test; update)
statement
Example:
for (i = 1; i <= NR; i++)
{
total += $i
count++
}
for Loop for arrays
Syntax:
for (var in array)
statement
UNIT-II
• A Unix shell, also called "the command line", provides the traditional user interface for
the Unix operating system and for Unix-like systems. Users direct the operation of the
computer by entering command input as text for a shell to execute.
– Bourne shell (sh)
– Bourne again shell (bash)
– C shell (csh)
– Korn shell (ksh)
• When we issue a command the shell is the first agency to acquire the information. It
accepts and interprets user requests. The shell examines &rebuilds the commands
&leaves the execution work to kernel. The kernel handles the h/w on behalf of these
commands &all processes in the system.
• The shell is generally sleeping. It wakes up when an input is keyed in at the prompt. This
input is actually input to the program that represents the shell.
SHELL RESPONSIBILITIES
• 1. Program Execution
• 2. Variable and Filename Substitution
• 3. I/O Redirection
• 4. Pipeline Hookup
• 5. Environment Control
• 6. Interpreted Programming Language
Program Execution:
• The shell is responsible for the execution of all programs that you request from your
terminal.
• Each time you type in a line to the shell, the shell analyzes the line and then determines
what to do.
• The line that is typed to the shell is known more formally as the command line. The shell
scans this command line and determines the name of the program to be executed and
what arguments to pass to the program.
• Like any other programming language, the shell lets you assign values to variables.
Whenever you specify one of these variables on the command line, preceded by a dollar
sign, the shell substitutes the value assigned to the variable at that point.
I/O Redirection:
• It is the shell's responsibility to take care of input and output redirection on the command
line. It scans the command line for the occurrence of the special redirection characters <,
>, or >>.
Pipeline Hookup:
• Just as the shell scans the command line looking for redirection characters, it also looks
for the pipe character |. For each such character that it finds, it connects the standard
output from the command preceding the | to the standard input of the one following the |.
It then initiates execution of both programs.
Environment Control:
• The shell provides certain commands that let you customize your environment. Your
environment includes home directory, the characters that the shell displays to prompt you
to type in a command, and a list of the directories to be searched whenever you request
that a program be executed.
• The shell has its own built-in programming language. This language is interpreted,
meaning that the shell analyzes each statement in the language one line at a time and then
executes it. This differs from programming languages such as C and FORTRAN, in
which the programming statements are typically compiled into a machine-executable
form before they are executed.
PIPES
• Standard I/p & standard o/p constitute two separate streams that can be individually
manipulated by the shell. The shell connects these streams so that one command takes I
/p from other using pipes.
• who produces the list of users. To save this output in a file and count the users:
$who > user.lst
$wc -l < user.lst
This two-step approach has drawbacks:
1. The output of the first command must be stored in an intermediate file before the next command can use it.
2. An intermediate file is required that has to be removed after the command has completed its run.
3. When handling large files, temporary files can build up easily and eat up disk space in no time.
• Instead of using two separate commands, the shell can use a special operator as the
connector of two commands-the pipe(|).
$who | wc -l
$ls | wc -l
REDIRECTION
• Many of the commands that we used sent their output to the terminal and also taking the
input from the keyboard. These commands are designed that way to accept not only fixed
sources and destinations. They are actually designed to use a character stream without
knowing its source and destination.
• A stream is a sequence of bytes that many commands see as input and output. Unix treats
these streams as files and a group of unix commands reads from and writes to these files.
• There are 3 streams or standard files. The shell sets up these 3 standard files and attaches
them to user terminal at the time of logging in.
• Instead of input coming from the keyboard and output and error going to the terminal,
they can be redirected to come from or go to any file or some other device.
• Using the symbols > and >>, you can redirect the output of a command to a file.
$who > newfile
• If the output file does not exist the shell creates it before executing the command. If it
exists the shell overwrites it.
• $who >> newfile appends the output to newfile instead of overwriting it.
STANDARD I/P:
• The standard input is the stream from which a command reads by default, normally the
keyboard. Using the < symbol, a command can take its input from a file instead:
$wc -l < user.lst
STANDARD ERROR:
• When you enter an incorrect command or try to open a nonexistent file, certain
diagnostic messages show up on the screen. This is the standard error stream.
$cat bar
cat: bar: No such file or directory
Each of the standard files has a number called a file descriptor, which is used for identification.
0—standard i/p
1---standard o/p
2---standard error
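The descriptor numbers can be used directly in redirections; the file names below are illustrative:

```shell
# Send stdout (1) and stderr (2) to different files;
# cat fails here because no_such_file does not exist
cat no_such_file > out.log 2> err.log || true
# err.log now holds the diagnostic and out.log is empty;
# 2>&1 would instead send stderr wherever stdout currently points
```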
HERE DOCUMENTS
• There are occasions when the data your program reads is fixed and fairly limited.
• The shell uses the << symbol to read data from the same file containing the script. This is
referred to as a here document, signifying that the data is here rather than in a separate
file.
• Any command using standard i/p can also take i/p from a here document.
• This feature is useful when used with commands that don’t accept a file name as
argument.
• Example:
$mail juliet << MARK
Your program for printing the invoices has been executed on `date`. Check the print queue.
MARK
• The shell treats every line following the command and delimited by MARK as
input to the command. Juliet at the other end will only see the lines of message text; the
word MARK itself doesn't show up.
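A runnable sketch of the same idea, feeding fixed data to wc through a here document:

```shell
# wc -l reads the three lines between << MARK and the closing MARK
wc -l << MARK
line one
line two
line three
MARK
# prints 3
```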
• Shell metacharacters
• The shell consists of a large number of metacharacters. These characters play a vital role in Unix
programming.
• $filename
• $sh filename
TYPES OF METACHARACTERS:
• 1.File substitution
• 2.I/O redirection
• 3.Process execution
• 4.Quoting metacharacters
• 5.Positional parameters
• 6.Special characters
Filename substitution:
• Metacharacter significance
I/O redirection:
• These special characters specify from where to take i/p & where to send o/p.
• < - to take i/p from specific location but not from keyboard.
• >>- to save the o/p in a particular file at the end of that file without overwriting it.
Process execution:
• ; - is used when you want to execute more than one command at the $ prompt.
• && - this is used when you want to execute the second command only if the first command
executed successfully.
Quoting:
• \ (backslash)- negates the special property of the single character following it.
• Eg:$echo \? \* \?
• ?*?
• " " (pair of double quotes) - negates the special properties of all enclosed characters except
$, `, and \.
Positional parameters:
Special parameters:
SHELL VARIABLES
• You can define and use variables both in the command line and in shell scripts. These variables
are called shell variables.
• Variables provide the ability to store and manipulate information within the shell
program. The variables are completely under the control of the user.
User-defined variables:
Generalized form:
variable=value.
Eg: $x=10
$echo $x
10
$unset x
• All shell variables are initialized to null strings by default. To explicitly set null values
use
x= or x='' or x=""
Environment Variables
• They are initialized when the shell script starts and normally capitalized to distinguish
them from user-defined variables in scripts
• To display all variables in the local shell and their values, type the set command
• The unset command removes the variable from the current shell and sub shell
$# No . of parameters passed
Parameter Variable
Shell commands
read:
• The read statement is a tool for taking input from the user, i.e. making scripts interactive. It
is used with one or more variable names.
Input supplied through the standard input is read into these variables.
$read name
Whatever you enter is stored in the variable name.
printf:
Printf is used to print formatted o/p.
printf "format" arg1 arg2 ...
Eg:
$ printf "This is a number: %d\n" 10
This is a number: 10
$
Printf supports conversion specification characters like %d, %s, %x, %o, etc.
Exit status of a command:
• Every command returns a value after execution .
• This value is called the exit status or return value of a command.
• This value is said to be true if the command executes successfully and false if it fails.
There is a special parameter used by the shell, $?. It stores the exit status of the last command.
exit:
• The exit statement is used to prematurely terminate a program.
• When this statement is encountered in a script, execution is halted and control is
returned to the calling program - in most cases the shell.
• You don't need to place exit at the end of every shell script because the shell knows
when script execution is complete.
set:
• Used without arguments, set displays the values of all variables in the current shell.
$set
: (the colon command):
• It is a null command.
• In some older shell scripts, a colon was used at the start of a line to introduce a comment,
but modern scripts use # now.
expr:
• The expr command evaluates its arguments as an expression:
• $ expr 8 + 6
• 14
• $ x=`expr 12 / 4 `
• $ echo $x
• 3
export:
• There is a way to make the value of a variable known to a sub shell, and that's by
exporting it with the export command. The format of this command is
export variables
where variables is the list of variable names that you want exported. For any sub shells that get
executed from that point on, the value of the exported variables will be passed down to the sub
shell.
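A sketch of the difference between a plain assignment and an exported one:

```shell
unset x y          # start clean for the demonstration
x="not exported"
export y="exported"
# The child shell inherits only the exported variable:
sh -c 'echo "x=[$x] y=[$y]"'
# prints: x=[] y=[exported]
```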
eval:
• eval scans the command line twice before executing it. General form for eval is
eval command-line
Eg:
$ cat last
eval echo \$$#
$ last one two three four
four
${n}
If you supply more than nine arguments to a program, you cannot access the tenth and greater
arguments with $10, $11, and so on.
${n} must be used. So to directly access argument 10, you must write
${10}
Shift command:
The shift command allows you to effectively left-shift your positional parameters. If you execute the
command
shift
whatever was previously stored in $2 will be assigned to $1, whatever was previously
stored in $3 will be assigned to $2, and so on. The old value of $1 will be irretrievably lost.
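A sketch of shift consuming the argument list (set -- is used here just to simulate positional parameters):

```shell
set -- one two three      # pretend the script was called with 3 arguments
while [ $# -gt 0 ]; do
  echo "\$1 is now: $1"   # $1 changes after every shift
  shift
done
# the loop prints one, two, three in turn, then the list is empty
```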
File Conditions
-d file True if the file is a directory
Example
$ mkdir temp
$ if [ -f temp ]; then
> echo "temp is a regular file"
> fi
(no output: temp is a directory, so -f is false)
Arithmetic Comparison
! ep True if ep is false
Example
$ x=5; y=7
$ if [ $x -lt $y ]; then
> echo "$x is less than $y"
> fi
5 is less than 7
do-while, repeat-until
for
select
Functions
Traps
User input
Syntax:
read varname [more vars]
or
Parameter Meaning
$1 $2 $3 $4
% echo $*
% echo $#
% echo $1
tim
% echo $3 $4
ann fred
case
loops
for
while
until
select
if statement
if command
then
statements
fi
statements are executed only if command succeeds, i.e. has return status 0
test command
Syntax:
test expression
[ expression ]
Example:
if test -w "$1"
then
fi
statements
fi
statements-1
else
statements-2
fi
The if…statement
if [ condition ]; then
statements
statement
else
statements
fi
Relational Operators
Logical operators
! not
&& and
|| or
#!/bin/bash
else
fi
#!/bin/bash
Bonus=500
then
else
fi
#!/bin/bash
then
else
fi
File Testing
Meaning
#!/bin/bash
read filename
if [ ! -r "$filename" ]
then
exit 1
fi
#! /bin/bash
if [ $# -lt 1 ]; then
exit 1
fi
then
exit 1
fi
fi
fi
* "TEST" COMMAND
fi
#!/bin/bash
let Net=$Income-$Expense
else
fi
use the case statement for a decision that is based on multiple choices
Syntax:
case word in
pattern1) command-list1
;;
pattern2) command-list2
;;
patternN) command-listN
;;
esac
case pattern
[…]
[:class:]
#!/bin/bash
case $reply in
ls -a ;;
ls ;;
Q) exit 0 ;;
esac
#!/bin/bash
ChildRate=3
AdultRate=10
SeniorRate=7
case $age in
[1][3-9]|[2-5][0-9])
esac
Data structure
Variables
Numeric variables
Arrays
User input
Control structures
if-then-else
case
Control structures
Repetition
do-while, repeat-until
for
select
Functions
Trapping signals
Repetition Constructs
Syntax:
while [ expression ]
do
command-list
done
#!/bin/bash
COUNTER=0
do
let COUNTER=$COUNTER+1
done
Cont="Y"
ps -A
done
echo "done"
#!/bin/bash
PICSDIR=/home/carol/pics
WEBDIR=/var/www/carol/webcam
while true; do
DATE=`date +%Y%m%d`
HOUR=`date +%H`
mkdir $WEBDIR/"$DATE"
DESTDIR=$WEBDIR/"$DATE"/"$HOUR"
mkdir "$DESTDIR"
mv $PICSDIR/*.jpg "$DESTDIR"/
sleep 3600
HOUR=`date +%H`
done
done
Syntax:
until [ expression ]
do
command-list
done
#!/bin/bash
COUNTER=20
do
echo $COUNTER
let COUNTER-=1
done
#!/bin/bash
Stop="N"
ps -A
done
echo "done"
Syntax:
do
commands
done
#!/bin/bash
for i in 7 9 2 3 4 5
do
echo $i
done
#!/bin/bash
for num in 1 2 3 4 5 6 7
do
let TempTotal=$TempTotal+$Temp
done
let AvgTemp=$TempTotal/7
#! /bin/bash
for parm
do
echo $parm
done
Select command
Constructs simple menu from word list
Syntax:
do
RESPECTIVE-COMMANDS
done
Select example
#! /bin/bash
do
echo $var
done
Prints:
1) alpha
2) beta
3) gamma
#? 2
beta
#? 4
#? 1
alpha
Select detail
#! /bin/bash
do
done
Output:
select ...
1) alpha
2) beta
?2
2 = beta
?1
1 = alpha
#!/bin/bash
select FILENAME in *
do
done
Output:
select ...
1) alpha
2) beta
?2
2 = beta
?1
1 = alpha
while [ condition ]
do
cmd-1
break
cmd-n
done
echo "done"
while [ condition ]
do
cmd-1
continue
cmd-n
done
echo "done"
Example:
for index in 1 2 3 4 5 6 7 8 9 10
do
echo "continue"
continue
fi
echo $index
echo "break"
break
fi
done
Decision:
if-then-else
case
Repetition
do-while, repeat-until
for
select
Functions
Traps
Shell Functions
A shell function is similar to a shell script
Where to define
In .profile
In your script
Remove a function
Syntax:
function-name () {
statements
}
Example: function
#!/bin/bash
funky () {
echo "This is a funky function."
}
funky
Example: function
#!/bin/bash
JUST_A_SECOND=1
let i=0
REPEATS=30
do
sleep $JUST_A_SECOND
let i+=1
done
fun
Function parameters
Need not be declared
Arguments provided via function call are accessible inside function as $1, $2, $3, …
testfile() {
if [ $# -gt 0 ]; then
else
fi
fi
testfile .
testfile funtest
#! /bin/bash
checkfile() {
for file
do
if [ -f "$file" ]; then
else
if [ -d "$file" ]; then
fi
fi
done
checkfile . funtest
• By default, variables defined inside a function are global, i.e. their values are known throughout the entire shell program
• keyword “local” inside a function definition makes referenced variables “local” to that
function
Example: function
#! /bin/bash
foo () {
echo $global
echo $inside
global="better variable"
echo $global
}
foo
echo $global
echo $inside
Handling signals
Unix allows you to send a signal to any process
• SIGKILL (signal 9, sent with kill -9) cannot be blocked or trapped
ps -u userid
Signals on Linux
% kill -l
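Within a script, the trap built-in installs a handler for a signal; this sketch removes a temporary file if the script is interrupted (the file and messages are illustrative):

```shell
tmpfile=$(mktemp)                        # scratch file to protect
trap 'rm -f "$tmpfile"; exit 1' INT TERM # handler runs on Ctrl-C or kill
echo "working with $tmpfile"
# ... long-running work would go here ...
rm -f "$tmpfile"                         # normal-path cleanup
trap - INT TERM                          # restore default dispositions
```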
#!/bin/sh
read timeofday
else
fi
exit 0
#!/bin/sh
read timeofday
else
exit 1
fi
exit 0
#!/bin/sh
read timeofday
else
exit 1
fi
exit 0
#!/bin/sh
read timeofday
else
exit 1
fi
exit 0
case
case variable in
....
esac
#!/bin/sh
case "$timeofday" in
exit 1;;
esac
^C is 2 - SIGINT
under this concept we discuss the functions or methods of regular file. On each function we discuss the
usage, syntax, arguments, argument values and return types
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
open -oflag
•O_SYNC Any writes on the resulting file descriptor will block the calling process until the data
has been physically written to the underlying hardware.
open -mode
•This mode only applies to future accesses of the newly created file.
Group: S_IRWXG,S_IRGRP,S_IWGRP,S_IXGRP
Identifying errors
–Using errno – a global variable set by the system call if an error has occurred.
–Defined in errno.h
#include <errno.h>
int fd;
if( fd < 0 ) {
return -1;
•ENOENT -O_CREAT is not set and the named file does not exist.
•EROFS -The named file resides on a read-only file system, and write access was requested.
•ENOSPC -O_CREAT is specified, the file does not exist, and there is no space left on the file system
containing the directory.
•EMFILE -The process has already reached its limit for open file descriptors.
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>
• fd
• offset
–Repositions the offset of the file descriptor fd to the argument offset according to the directive
whence.
•Return value
lseek–whence
•SEEK_SET -The offset is set to offset bytes from the beginning of the file.
•SEEK_CUR -The offset is set to its current location plus offset bytes.
•SEEK_END -The offset is set to the size of the file plus offset bytes.
–If we use SEEK_END and then write to the file, it extends the file size in the kernel and the gap is
read back as null (zero) bytes: a file "hole".
lseek: Examples
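The hole-creating write can also be sketched from the shell with dd, which performs the lseek (via seek=) before writing:

```shell
printf 'abcdefghij' > file.hole              # 10 bytes written
# seek=40 makes dd seek to offset 40 before writing 10 more bytes;
# bytes 10..39 become a "hole" that reads back as zeros
printf 'ABCDEFGHIJ' | dd of=file.hole bs=1 seek=40 conv=notrunc 2>/dev/null
wc -c < file.hole                            # prints 50
```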
lseek-errno
• lseek() will fail and the file pointer will remain unchanged if:
#include <unistd.h>
•Attempts to read nbytes of data from the object referenced by the descriptor fd into the buffer pointed to
by buff.
•Otherwise, -1 is returned and the global variable errno is set to indicate the error.
read - errno
•EBADF -fd is not a valid file descriptor or it is not open for reading.
•EIO -An I/O error occurred while reading from the file system.
•EINTR The call was interrupted by a signal before any data was read
•EAGAIN-The file was marked for non-blocking I/O, and no data was ready to be read.
#include <unistd.h>
•Attempts to write nbytes of data to the object referenced by the descriptor fd from the buffer pointed to
by buff.
•“A successful return from write() does not make any guarantee that data has been committed to disk.”
write -errno
•EPIPE -An attempt is made to write to a pipe that is not open for reading by any process.
•EFBIG -An attempt was made to write a file that exceeds the maximum file size.
•EINVAL - fd is attached to an object which is unsuitable for writing (such as keyboards).
•ENOSPC -There is no free space remaining on the file system containing the file.
•EDQUOT -The user's quota of disk blocks on the file system containing the file has been exhausted.
•EIO -An I/O error occurred while writing to the file system.
•EAGAIN -The file was marked for non-blocking I/O, and no data could be written immediately.
Example
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
char buf1[] = "abcdefghij";
char buf2[] = "ABCDEFGHIJ";
int main(void) {
int fd;
fd = creat("file.hole", S_IRUSR|S_IWUSR|S_IRGRP);
if( fd< 0 ) {
perror("createrror");
exit(1);
}
if( write(fd, buf1, 10) != 10 ) {
perror("buf1 write error");
exit(1);
}
/* offset now = 10 */
if( lseek(fd, 40, SEEK_SET) == -1 )
{
perror("lseek error");
exit(1);
}
/* offset now = 40 */
if(write(fd, buf2, 10) != 10)
{
perror("buf2 write error");
exit(1);
}
/* offset now = 50 */
exit(0);
}
Example –copying a file
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
enum {BUF_SIZE = 16};
int main(int argc, char* argv[])
{
int fdread, fdwrite;
unsigned int total_bytes = 0;
ssize_t nbytes_read, nbytes_write;
char buf[BUF_SIZE];
if (argc != 3) {
printf("Usage: %s source destination\n",
argv[0]);
exit(1);
}
fdread = open(argv[1], O_RDONLY);
if (fdread < 0) {
perror("Failed to open source file");
exit(1);
}
fdwrite = creat(argv[2], S_IRWXU);
if (fdwrite < 0) {
perror("Failed to open detination file");
exit(1);
}
do {
nbytes_read = read(fdread, buf, BUF_SIZE);
if (nbytes_read < 0) {
perror("Failed to read from source file");
exit(1);
}
nbytes_write = write(fdwrite, buf, nbytes_read);
total_bytes += nbytes_write;
} while (nbytes_read == BUF_SIZE);
printf("%u bytes copied\n", total_bytes);
return 0;
}
#include <unistd.h>
•Duplicates an existing object descriptor and returns its value to the calling process.
•Causes the file descriptor newfd to refer to the same file as oldfd. The object referenced by the descriptor
does not distinguish between oldfd and newfd in any way.
dup2 -errno
•EBADF - oldfd isn't an open file descriptor, or newfd is out of the allowed range for file descriptors.
•Note: If a separate pointer to the file is desired, a different object reference to the file must be obtained
by issuing an additional open() call.
Dup2 -comments
• dup2(fd, 0) - whenever the program tries to read from standard input, it will read from fd.
• dup2(fd, 1) - whenever the program tries to write to standard output, it will write to fd.
•After arranging the redirections, the desired program is run using exec
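This is exactly what the shell does for ordinary redirections: conceptually, sort < data > out opens the two files, dup2()s them onto descriptors 0 and 1, and then execs sort (file names here are illustrative):

```shell
printf '3\n1\n2\n' > data     # sample input
sort < data > out             # shell: open + dup2 onto fds 0 and 1, then exec sort
cat out
# prints: 1, 2, 3 (one per line)
```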
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>
•manipulate file descriptors.
fcntl-cmd
fcntl–example 1
#include <stdio.h>
#include <sys/types.h>
#include <fcntl.h>
int accmode, val;
if( argc!= 2 ) {
exit(1);
if (val< 0 ) {
exit( 1 );
else {
exit(1);
printf(", nonblocking");
putchar('\n');
exit(0);
fcntl–example 2
#include <stdio.h>
#include <sys/types.h>
#include <fcntl.h>
int val;
if (val< 0 ) {
exit( 1 );
if( val< 0 ) {
exit( 1 );
#include <unistd.h>
•Atomically creates the specified directory entry (hard link) newpath with the attributes of the underlying
object pointed at by existingpath.
•If the link is successful: the link count of the underlying object is incremented; newpath and existingpath
share equal access and rights to the underlying object.
•If existingpath is removed, the file newpath is not deleted and the link count of the underlying object is
decremented.
link() example
~>ls -l
total 8
•Make a link in C:
link("test","new_name”)
~>ls-l
total 9
•To the normal user, a symbolic link behaves like a file, but the underlying mechanism is different.
•Creates a special type of file whose contents are the name of the target file
•The existence of the file is not affected by the existence of symbolic links to it.
symlink() example
~> ls -l
total 7
•Make a link in C:
symlink("test","new_name”)
~> ls -l
total 8
•Removes the link (soft or hard) named by pathname from its directory
•If a hard link, decrements the link count of the file which was referenced by the link.
•If that decrement reduces the link count of the file to zero, and no process has the file open, then all
resources associated with the file are reclaimed.
•If one or more processes have the file open when the last link is removed, the link is removed, but the
removal of the file is delayed until all references to it have been closed.
remove
int remove(const char *pathname);
•If path specifies a directory, remove(path) is the equivalent of rmdir(path). Otherwise, it is the equivalent
of unlink(path).
•If oldname is a symbolic link, the symbolic link is renamed, not the file or directory to which it points.
#include <sys/types.h>
#include <sys/stat.h>
stat
•Read, write or execute permission of the named file is not required, but all directories listed in the path
name leading to the file must be searchable.
fstat
•Obtains the same information about an open file known by the file descriptor fd.
•like stat() except that ifthe named file is a symbolic link, lstat() returns information about the link, while
stat() returns information about the file the link references.
struct stat
struct stat {
ino_t st_ino; /* inode number */
...
gid_t st_gid; /* group ID */
...
};
#include <sys/types.h>
#include <sys/stat.h>
mode_t umask(mode_t cmask);
•The umask sets the file mode creation mask: the permission bits to block when a user creates directories and files.
•The bits in the umask are turned off from the mode argument to open.
•Example: If the umask value is octal 022, files requested with mode 0666 are created with permissions
0666 & ~022 = 0644.
#include <sys/types.h>
#include <sys/stat.h>
int chmod(const char *pathname, mode_t mode);
•Sets the file permission bits of the file specified by the pathname pathname to mode.
•The user must be the file owner (or superuser) to change the mode.
#include <sys/types.h>
#include <unistd.h>
int chown(const char *pathname, uid_t owner, gid_t group);
•The owner ID and group ID of the file named by pathname are changed as specified by the arguments
owner and group.
•The owner of a file may change the group.
•Changing the owner is usually allowed only to the superuser.
FILE LOCKS:
File locking is a mechanism that restricts access to a computer file by allowing only one user or process
to access it at any specific time. Systems implement locking to prevent the classic interceding update scenario.
The following example illustrates the interceding update problem:
1. Process A reads a customer record from a file containing account information, including the
customer's account balance and phone number.
2. Process B now reads the same record from the same file so it has its own copy.
3. Process A changes the account balance in its copy of the customer record and writes the record
back to the file.
4. Process B—which still has the original stale value for the account balance in its copy of the
customer record—updates the customer's phone number and writes the customer record back to
the file.
5. Process B has now written its stale account-balance value to the file, causing the changes made
by process A to be lost.
1) Read lock: If a process applies a read lock on data, it prevents writing to the data by another process
but allows another process to read. This is also called a shared lock.
2) Write lock: If a process applies a write lock, it prevents another process from reading and writing. This
is also called an exclusive lock.
There are two types of locking mechanisms: mandatory and advisory. Mandatory systems will actually
prevent read()s and write()s to the file.
Advisory Lock:
With an advisory lock system, processes can still read and write from a file while it's locked. Useless?
Not quite, since there is a way for a process to check for the existence of a lock before a read or write.
See, it's a kind of cooperative locking system. This is easily sufficient for almost all cases where file
locking is necessary.
Mandatory Lock:
Mandatory locks are rarely used: they require special filesystem support, and most Unix programs rely on
advisory locks instead. The rest of this section deals with advisory locks.
There are two types of (advisory!) locks: read locks and write locks (also referred to as shared locks and
exclusive locks, respectively.) The way read locks work is that they don't interfere with other read locks.
For instance, multiple processes can have a file locked for reading at the same time. However, when a process
has a write lock on a file, no other process can acquire either a read or write lock until it is relinquished.
One easy way to think of this is that there can be multiple readers simultaneously, but there can only be
one writer at a time.
Setting a lock
The fcntl() function does just about everything on the planet, but we'll just use it for file locking. Setting
the lock consists of filling out a struct flock (declared in fcntl.h) that describes the type of lock needed,
open()ing the file with the matching mode, and calling fcntl() with the proper arguments, comme ça:
struct flock fl;

fl.l_type = F_WRLCK;       /* write lock */
fl.l_whence = SEEK_SET;    /* relative to the beginning of the file */
fl.l_start = 0;            /* byte offset 0 */
fl.l_len = 0;              /* 0 means "to end of file" */
fl.l_pid = getpid();

fd = open("filename", O_WRONLY);
fcntl(fd, F_SETLKW, &fl);  /* set the lock, blocking if necessary */
What just happened? Let's start with the struct flock since the fields in it are used to describe the locking
action taking place. Here are some field definitions:
l_type This is where you signify the type of lock you want to set. It's either F_RDLCK,
F_WRLCK, or F_UNLCK if you want to set a read lock, write lock, or clear the
lock, respectively.
l_whence This field determines where the l_start field starts from (it's like an offset for the
offset). It can be either SEEK_SET, SEEK_CUR, or SEEK_END, for beginning of
file, current file position, or end of file.
l_start This is the starting offset in bytes of the lock, relative to l_whence.
l_len This is the length of the lock region in bytes (which starts from l_start, which is
relative to l_whence).
l_pid The process ID of the process dealing with the lock. Use getpid() to get this.
In our example, we told it to make a lock of type F_WRLCK (a write lock), starting relative to SEEK_SET
(the beginning of the file), offset 0, length 0 (a zero value means "lock to end-of-file"), with the PID set to
getpid().
The next step is to open() the file, since fcntl() needs a file descriptor for the file that's being locked. Note
that when you open the file, you need to open it in the same mode as you have specified in the lock, as
shown in the table, below. If you open the file in the wrong mode for a given lock type, fcntl() will return
-1 and errno will be set to EBADF.
l_type      mode
F_RDLCK     O_RDONLY or O_RDWR
F_WRLCK     O_WRONLY or O_RDWR
Finally, the call to fcntl() actually sets, clears, or gets the lock. See, the second argument (the cmd) to
fcntl() tells it what to do with the data passed to it in the struct flock. The following list summarizes what
each fcntl() cmd does:
F_SETLKW This argument tells fcntl() to attempt to obtain the lock requested in the struct flock
structure. If the lock cannot be obtained (since someone else has it locked already),
fcntl() will wait (block) until the lock has cleared, then will set it itself. This is a very
useful command. I use it all the time.
F_SETLK This function is almost identical to F_SETLKW. The only difference is that this one will
not wait if it cannot obtain a lock. It will return immediately with -1. This function can be
used to clear a lock by setting the l_type field in the struct flock to F_UNLCK.
F_GETLK If you want to only check to see if there is a lock, but don't want to set one, you can use
this command. It looks through all the file locks until it finds one that conflicts with the
lock you specified in the struct flock. It then copies the conflicting lock's information into
the struct and returns it to you. If it can't find a conflicting lock, fcntl() returns the struct
as you passed it, except it sets the l_type field to F_UNLCK.
In our above example, we call fcntl() with F_SETLKW as the argument, so it blocks until it can set the
lock, then sets it and continues.
Whew! After all the locking stuff up there, it's time for something easy: unlocking! Actually, this is a
piece of cake in comparison. I'll just reuse that first example and add the code to unlock it at the end:
int fd;
struct flock fl;

fl.l_type = F_WRLCK;
fl.l_whence = SEEK_SET;
fl.l_start = 0;
fl.l_len = 0;
fl.l_pid = getpid();

fd = open("filename", O_WRONLY);
fcntl(fd, F_SETLKW, &fl);   /* set the lock as before */

fl.l_type = F_UNLCK;        /* change only the lock type... */
fcntl(fd, F_SETLK, &fl);    /* ...and clear the lock */
Now, I left the old locking code in there for high contrast, but you can tell that I just changed the l_type
field to F_UNLCK (leaving the others completely unchanged!) and called fcntl() with F_SETLK as the
command. Easy!
Here, I will include a demo program, lockdemo.c, that waits for the user to hit return, then locks its own
source, waits for another return, then unlocks it. By running this program in two (or more) windows, you
can see how programs interact while waiting for locks.
Basically, usage is this: if you run lockdemo with no command line arguments, it tries to grab a write
lock (F_WRLCK) on its source (lockdemo.c). If you start it with any command line arguments at all, it
tries to get a read lock (F_RDLCK) on it.
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    int fd;
    struct flock fl;

    fl.l_type = F_WRLCK;       /* write lock by default */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;              /* 0 = lock to end of file */
    fl.l_pid = getpid();

    if (argc > 1)
        fl.l_type = F_RDLCK;   /* any argument: try a read lock instead */

    if ((fd = open("lockdemo.c", O_RDWR)) == -1) { perror("open"); exit(1); }

    printf("Press <RETURN> to try to get lock: ");
    getchar();
    if (fcntl(fd, F_SETLKW, &fl) == -1) { perror("fcntl"); exit(1); }
    printf("got lock\n");

    printf("Press <RETURN> to release lock: ");
    getchar();
    fl.l_type = F_UNLCK;       /* clear the lock */
    if (fcntl(fd, F_SETLK, &fl) == -1) { perror("fcntl"); exit(1); }
    printf("Unlocked.\n");

    close(fd);
    return 0;
}
mkdir: This creates a new directory with no initial contents (apart from `.' and `..').
int mkdir(const char *pathname, mode_t mode);
It returns -1 on failure.
readdir: Reads the next entry from a directory stream that has already been created and opened with opendir().
struct dirent *readdir(DIR *dirp);
Objective Questions:
environment is:
a. lf b. listdir c. dir
a. Is a text file created using vi editor b. Is a text file created using a notepad
3. In the windows environment file extension identifies the application that created
it. If we remove the file extension can we still open the file?
a. Yes b. No
4. Which of the following files in the current directory are identified by the regular
expression a?b*.
5. For some file the access permissions are modified to 764. Which of the following
a. Every one can read, group can execute only and the owner can read and
write.
b. Every one can read and write, but owner alone can execute.
c. Every one can read, group including owner can write, owner alone can
execute
a. File owners’ name b. File size c. The date of last modification d. Date of file creation
c. The access permissions for the file d. All the dates of modification since the file’s creation
8. File which are linked have as many inodes as are the links.
a. True b. False
11. An indexed allocation policy affords faster information retrieval than the chained
allocation policy.
a. True b. False
12. Absolute path names begin by identifying path from the root.
a. True b. False
UNIT-IV
Processes Concepts:
A process is more than just a program. Especially in a multi-user, multi-tasking operating system
such as Linux there is much more to consider. Each program has a set of data that it uses to do
what it needs. Often, this data is not part of the program. For example, if you are using a text
editor, the file you are editing is not part of the program on disk, but is part of the process in
memory. If someone else were to be using the same editor, both of you would be using the same
program. However, each of you would have a different process in memory. See the figure below
to see how this looks graphically.
Image - Reading programs from the hard disk to create processes.
Under Linux many different users can be on the system at the same time. In other words, they
have processes that are in memory all at the same time. The system needs to keep track of what
user is running what process, which terminal the process is running on, and what other resources
the process has (such as open files). All of this is part of the process.
With the exception of the init process (PID 1) every process is the child of another process.
Another example we see in the next figure. When you login, you normally have a single process,
which is your login shell(bash). If you start the X Windowing System, your shell starts another
process, xinit. At this point, both your shell and xinit are running, but the shell is waiting for xinit
to complete. Once X starts, you may want a terminal in which you can enter commands, so you
start xterm.
Process API
Fork():
The fork() system call will spawn a new child process which is an identical process to the parent except
that it has a new system process ID. The process is copied in memory from the parent and a new process
structure is assigned by the kernel. The return value of the function is what discriminates the two threads
of execution: zero is returned by fork() in the child's process, and the child's process ID is returned in the parent.
exit() vs _exit():
The C library function exit() calls the kernel system call _exit() internally. The kernel system call _exit()
will cause the kernel to close descriptors, free memory, and perform the kernel terminating process clean-
up. The C library function exit() call will flush I/O buffers and perform additional clean-up before calling
_exit() internally. The function exit(status) causes the executable to return "status" as the return code for
main(). When exit(status) is called by a child process, it allows the parent process to examine the
terminating status of the child (if it terminates first). Without this call (or a call from main() to return())
and specifying the status argument, the process will not return a value.
vfork():
The vfork() function is the same as fork() except that it does not make a copy of the address space. The
memory is shared reducing the overhead of spawning a new process with a unique copy of all the
memory. This is typically used when using fork() to exec() a process and terminate. The vfork()
function also executes the child process first and resumes the parent process when the child terminates.
wait(): Blocks the calling process until a child process terminates. If a child process has already
terminated, the wait() call returns immediately. If the calling process has multiple child processes,
the function returns when any one of them terminates.
waitpid(): Like wait(), but with options to block the calling process for a particular child process, not just
the first one to terminate.
Kill():
This is the real reason to set up a process group. One may kill all the processes in the process group
without having to keep track of how many processes have been forked and all of their process id's.
execl():
The function call "execl()" initiates a new program in the same environment in which it is operating. An
executable (with fully qualified path, i.e. /bin/ls) and arguments are passed to the function. Note that
"arg0" is the command/file name to execute.
int execl(const char *path, const char *arg0, const char *arg1, const char
*arg2, ... const char *argn, (char *) 0);
Where all function arguments are null terminated strings. The list of arguments is terminated by
NULL.
The routine execlp() will perform the same purpose except that it will use environment variable PATH to
determine which executable to process. Thus a fully qualified path name would not have to be used. The
first argument to the function could instead be "ls". The function execlp() can also take the fully qualified
name as it also resolves explicitly.
execv():
This is the same as execl() except that the arguments are passed as a null terminated array of pointers to
char. The first element "argv[0]" is the command name.
The routine execvp() will perform the same purpose except that it will use environment variable PATH
to determine which executable to process. Thus a fully qualified path name would not have to be used.
The first argument to the function could instead be "ls". The function execvp() can also take the fully
qualified name as it also resolves explicitly.
execve():
This is the underlying system call used by the other exec functions. It takes the program path, a null
terminated argument array, and an explicit environment array:
int execve(const char *path, char *const argv[], char *const envp[]);
Zombie Process
On Linux operating systems, a zombie process or defunct process is a process that has completed
execution but still has an entry in the process table, allowing the process that started it to read its exit
status. In the term's colorful metaphor, the child process has died but has not yet been reaped.
When a process ends, all of the memory and resources associated with it are deallocated so they can be
used by other processes. However, the process's entry in the process table remains. The parent is sent a
SIGCHLD signal indicating that a child has died; the handler for this signal will typically execute the wait
system call, which reads the exit status and removes the zombie. The zombie's process ID and entry in the
process table can then be reused. However, if a parent ignores the SIGCHLD, the zombie will be left in
the process table. In some situations this may be desirable, for example if the parent creates another child
process it ensures that it will not be allocated the same process ID.
A zombie process is not the same as an orphan process. Orphan processes don't become zombie
processes; instead, they are adopted by init (process ID 1), which waits on its children.
The term zombie process derives from the common definition of zombie: an undead person.
Zombies can be identified in the output from the Unix ps command by the presence of a "Z" in the STAT
column. Zombies that exist for more than a short period of time typically indicate a bug in the parent
program. As with other leaks, the presence of a few zombies isn't worrisome in itself, but may indicate a
problem that would grow serious under heavier loads.
To remove zombies from a system, the SIGCHLD signal can be sent to the parent manually, using the kill
command. If the parent process still refuses to reap the zombie, the next step would be to remove the
parent process. When a process loses its parent, init becomes its new parent. Init periodically executes the
wait system call to reap any zombies with init as parent.
Orphan Process
An orphan process is a computer process whose parent process has finished or terminated.
A process can become orphaned during remote invocation when the client process crashes after making a
request of the server.
Orphans waste server resources and can potentially leave a server in trouble. However there are several
solutions to the orphan process problem:
1. Extermination is the most commonly used technique; in this case the orphan process is killed.
2. Reincarnation is a technique in which machines periodically try to locate the parents of any
remote computations; at which point orphaned processes are killed.
3. Expiration is a technique where each process is allotted a certain amount of time to finish before
being killed. If need be a process may "ask" for more time to finish before the allotted time
expires.
A process can also be orphaned running on the same machine as its parent process. In a UNIX-like
operating system any orphaned process will be immediately adopted by the special "init" system process.
This operation is called re-parenting and occurs automatically. Even though technically the process has
the "init" process as its parent, it is still called an orphan process since the process which originally
created it no longer exists.
Objective: how more than one process communicates with other processes, and the calling functions and
kernel support involved.
IPC:Message Queues:<sys/msg.h>
Two (or more) processes can exchange information via access to a common system message queue. The
sending process places via some (OS) message-passing module a message onto a queue which can be read
by another process . Each message is given an identification or type so that processes can select the
appropriate message. Process must share a common key in order to gain access to the queue in the first
place (subject to other permissions -- see below).
Basic Message Passing IPC messaging lets processes send and receive messages, and queue messages
for processing in an arbitrary order. Unlike the file byte-stream data flow of pipes, each IPC message has
an explicit length. Messages can be assigned a specific type. Because of this, a server process can direct
message traffic between clients on its queue by using the client process PID as the message type. For
single-message transactions, multiple server processes can work in parallel on transactions sent to a
shared message queue.
Before a process can send or receive a message, the queue must be initialized (through the msgget
function see below) Operations to send and receive messages are performed by the msgsnd() and
msgrcv() functions, respectively.
When a message is sent, its text is copied to the message queue. The msgsnd() and msgrcv() functions can
be performed as either blocking or non-blocking operations. Non-blocking operations allow for
asynchronous message transfer -- the process is not suspended as a result of sending or receiving a
message. In blocking or synchronous message passing the sending process cannot continue until the
message has been transferred or has even been acknowledged by a receiver. IPC signal and other
mechanisms can be employed to implement such transfer. A blocked message operation remains
suspended until one of the following three conditions occurs:
int msgget(key_t key, int msgflg);
The msgget() function initializes a new message queue. It can also return the message queue ID (msqid) of
the queue corresponding to the key argument. The value passed as the msgflg argument must be an octal
integer with settings for the queue's permissions and control flags.
Processes requesting access to an IPC facility must be able to identify it. To do this, functions that
initialize or provide access to an IPC facility use a key_t key argument. (key_t is essentially an int type
defined in <sys/types.h>
The key is an arbitrary value or one that can be derived from a common seed at run time. One way is with
ftok() , which converts a filename to a key value that is unique within the system. Functions that initialize
or get access to messages (also semaphores or shared memory see later) return an ID number of type int.
IPC functions that perform read, write, and control operations use this ID. If the key argument is specified
as IPC_PRIVATE, the call initializes a new instance of an IPC facility that is private to the creating
process. When the IPC_CREAT flag is supplied in the flags argument appropriate to the call, the function
tries to create the facility if it does not exist already. When called with both the IPC_CREAT and
IPC_EXCL flags, the function fails if the facility already exists. This can be useful when more than one
process might attempt to initialize the facility. One such case might involve several server processes
having access to the same facility. If they all attempt to create the facility with IPC_EXCL in effect, only
the first attempt succeeds. If neither of these flags is given and the facility already exists, the functions to
get access simply return the ID of the facility. If IPC_CREAT is omitted and the facility is not already
initialized, the calls fail. These control flags are combined, using logical (bitwise) OR, with the octal
permission modes to form the flags argument. For example, the statement below initializes a new
message queue if the queue does not exist.
msqid = msgget(ftok("/tmp", 'a'), IPC_CREAT | 0666);
The first argument evaluates to a key based on the string ("/tmp") and a project ID (here 'a'). The second
argument evaluates to the combined permissions and control flags.
The msgctl() function alters the permissions and other characteristics of a message queue. The owner or
creator of a queue can change its ownership or permissions using msgctl() Also, any process with
permission to do so can use msgctl() for control operations.
The msqid argument must be the ID of an existing message queue. The cmd argument is one of:
IPC_STAT
-- Place information about the status of the queue in the data structure pointed to by buf. The
process must have read permission for this call to succeed.
IPC_SET
-- Set the owner's user and group ID, the permissions, and the size (in number of bytes) of the
message queue. A process must have the effective user ID of the owner, creator, or superuser for
this call to succeed.
IPC_RMID
-- Remove the message queue msqid from the system. A process must have the effective user ID of
the owner, creator, or superuser for this call to succeed.
The following code illustrates the msgctl() function with all its various flags:
#include<sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
...
if (msgctl(msqid, IPC_STAT, &buf) == -1) {
perror("msgctl: msgctl failed");
exit(1);
}
...
if (msgctl(msqid, IPC_SET, &buf) == -1) {
perror("msgctl: msgctl failed");
exit(1);
}
...
The msgsnd() and msgrcv() functions send and receive messages, respectively:
int msgsnd(int msqid, const void *msgp, size_t msgsz, int msgflg);
ssize_t msgrcv(int msqid, void *msgp, size_t msgsz, long msgtyp, int msgflg);
The msqid argument must be the ID of an existing message queue. The msgp argument is a pointer to a
structure that contains the type of the message and its text. The structure below is an example of what this
user-defined buffer might look like:
struct mymsg {
    long mtype;        /* message type */
    char mtext[MSGSZ]; /* body of the message */
};
The structure member mtype is the received message's type as specified by the sending process.
The argument msgflg specifies the action to be taken if one or more of the following are true:
• The total number of messages on all queues system-wide is equal to the system-imposed limit.
• If (msgflg & IPC_NOWAIT) is non-zero, the message will not be sent and the calling process
will return immediately.
• If (msgflg & IPC_NOWAIT) is 0, the calling process will suspend execution until one of the
following occurs:
o The condition responsible for the suspension no longer exists, in which case the message
is sent.
o The message queue identifier msqid is removed from the system; when this occurs, errno
is set equal to EIDRM and -1 is returned.
o The calling process receives a signal that is to be caught; in this case the message is not
sent and the calling process resumes execution.
Upon successful completion, the following actions are taken with respect to the data structure
associated with msqid:
o msg_qnum is incremented by 1.
o msg_lspid is set equal to the process ID of the calling process.
o msg_stime is set equal to the current time.
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
...
...
if (msgp == NULL) {
    exit(1);
}
...
msgsz = ...
msgflg = ...
...
msgsz = ...
msgtyp = first_on_queue;
msgflg = ...
...
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <stdio.h>
#include <string.h>
#define MSGSZ 128

/*
 * Declare the message structure.
 */
typedef struct msgbuf {
    long mtype;
    char mtext[MSGSZ];
} message_buf;
main()
{
int msqid;
int msgflg = IPC_CREAT | 0666;
key_t key;
message_buf sbuf;
size_t buf_length;
/*
* Get the message queue id for the
* "name" 1234, which was created by
* the server.
*/
key = 1234;
if ((msqid = msgget(key, msgflg)) < 0) {
    perror("msgget");
    exit(1);
}
/*
* We'll send message type 1
*/
sbuf.mtype = 1;
(void) strcpy(sbuf.mtext, "Did you get this?");
buf_length = strlen(sbuf.mtext) + 1;
/*
* Send a message.
*/
if (msgsnd(msqid, &sbuf, buf_length, IPC_NOWAIT) < 0) {
printf ("%d, %d, %s, %d\n", msqid, sbuf.mtype, sbuf.mtext, buf_length);
perror("msgsnd");
exit(1);
}
else
printf("Message: \"%s\" Sent\n", sbuf.mtext);
exit(0);
}
• The Message queue is created with a basic key and message flag msgflg = IPC_CREAT | 0666
-- create queue and make it read and appendable by all.
• A message of type (sbuf.mtype) 1 is sent to the queue with the message ``Did you get this?''
The full code listing for message_send.c's companion process, message_rec.c is as follows:
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <stdio.h>
#define MSGSZ 128

/*
 * Declare the message structure.
 */
typedef struct msgbuf {
    long mtype;
    char mtext[MSGSZ];
} message_buf;
main()
{
int msqid;
key_t key;
message_buf rbuf;
/*
* Get the message queue id for the
* "name" 1234, which was created by
* the server.
*/
key = 1234;
if ((msqid = msgget(key, 0666)) < 0) {
    perror("msgget");
    exit(1);
}
/*
* Receive an answer of message type 1.
*/
if (msgrcv(msqid, &rbuf, MSGSZ, 1, 0) < 0) {
perror("msgrcv");
exit(1);
}
/*
* Print the answer.
*/
printf("%s\n", rbuf.mtext);
exit(0);
}
• The Message queue is opened with msgget (message flag 0666) and the same key as
message_send.c.
• A message of the same type 1 is received from the queue with the message ``Did you get this?''
stored in rbuf.mtext.
UNIT-V
CONTENT OVERVIEW:
Semaphores are not used to exchange a large amount of data. Semaphores are
used for synchronization among processes. Other synchronization mechanisms include record locking and
mutexes. Why necessary? Examples include: a shared washroom, a common rail segment, and a common
bank account.
Further comments:
1) The semaphore is stored in the kernel: Allows atomic operations on the semaphore. Processes are
prevented from indirectly modifying the value.
2) A process acquires the semaphore if it has a value of zero. The value of the semaphore is then
incremented to 1. When a process releases the semaphore, the value of the semaphore is decremented.
3) If the semaphore has non-zero value when a process tries to acquire it, that process blocks.
4) In comments 2 and 3, the semaphore acts as a customer counter. In most cases, it is a resource counter.
5) When a process waits for a semaphore, the kernel puts the process “to sleep” until the semaphore is
available. This is better (more efficient) than busy waiting such as TEST&SET.
6) The kernel maintains information on each semaphore internally, using a data structure struct semid_ds
that keeps track of permission, number of semaphores, etc.
7) Apparently, a semaphore in Unix is not a single binary value, but a set of nonnegative integer values.
The kernel's struct semid_ds holds, among other members:
    sem_perms   /* permission structure */
    sem_base    /* pointer to the array of values: semval[0] is the first semaphore value, nonnegative */
    sem_nsems   /* number of semaphores in the set */
    sem_otime   /* time of last semop() */
    sem_ctime   /* time of last change */
Binary semaphore – has a value of 0 or 1. Similar to a mutex lock. 0 means locked; 1 means unlocked.
Counting semaphore – has a value ≥ 0. Used for counting resources, like the producer-consumer
example. Note that value = 0 is similar to a lock (resource not available).
Set of counting semaphores – one or more semaphores, each of which is a counting semaphore.
Key terms: concurrent process, critical region, shared resource, deadlock, mutual exclusion, primitive,
atomic operation.
int semget(key_t key, int nsems, int semflag); -- returns int semid;
int semop(int semid, struct sembuf *opsptr, size_t nops);
opsptr — points to an array of one or more operations. Each operation is defined as:
struct sembuf {
    short sem_num;  /* which semaphore in the set */
    short sem_op;   /* negative: acquire; 0: wait for zero; positive: release */
    short sem_flg;  /* IPC_NOWAIT, SEM_UNDO */
};
More notes:
As a customer counter, a semaphore is acquired doing the first two operations in one call; a semaphore is
released using the third operation. See the following program example.
Blocking calls end when the request is satisfied, the semaphore set is deleted, or a signal is received.
Keep in mind: all operations in one semop() must be finished atomically by the kernel. Either all of
them are performed or none is.
int semctl(int semid, int semnum, int cmd, union semun arg); ─ Return value depends on cmd, -1 on
error.
union semun {
    int val;                /* for SETVAL */
    struct semid_ds *buf;   /* for IPC_STAT, IPC_SET */
    unsigned short *array;  /* for GETALL, SETALL */
} arg;
cmd --- IPC_RMID to remove a semaphore set. union semun arg is not used in this case.
GETVAL/SETVAL to fetch/set a specific value. semnum can specify a member of the semaphore
set.
arg.val = 1;
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#define SEMKEY 123456L /* key value for semget() */
#define PERMS 0666
static struct sembuf op_lock[2]= { 0, 0, 0, /* wait for sem #0 to become 0 */
0, 1, SEM_UNDO /* then increment sem #0 by 1 */ };
static struct sembuf op_unlock[1]= { 0, -1, (IPC_NOWAIT | SEM_UNDO)
/* decrement sem #0 by 1 (sets it to 0) */ };
int semid = -1; /* semaphore id. Only the first time will create a semaphore.*/
my_lock( )
{
    if (semid < 0) {
        if ((semid = semget(SEMKEY, 1, IPC_CREAT | PERMS)) < 0) printf("semget error");
    }
    if (semop(semid, &op_lock[0], 2) < 0) printf("semop lock error");
}
my_unlock( )
{
    if (semop(semid, &op_unlock[0], 1) < 0) printf("semop unlock error");
}
Questions: how to rewrite the above program to make the semaphore as a resource counter? What if the
resource
allows 3 or more processes to use at the same time?
Solutions:
static struct sembuf op_lock[1]= { 0, -1, SEM_UNDO, };
static struct sembuf op_unlock[1]= { 0, 1, (IPC_NOWAIT | SEM_UNDO) };
Don’t forget to set the initial value of the semaphore as 1 or 3.
Shared Memory
Using a pipe or a message queue requires multiple exchanges of data through the kernel.
Shared memory can be used to bypass the kernel for faster processing.
The kernel maintains information about each shared memory segment, including permission, size, access
time, etc., in struct shmid_ds.
int shmget(key_t key, size_t size, int shmflag); -- returns int shmid;
shmflag --- same as for msgget() and semget(), see Lecture 4's Comments 1.
void *shmat(int shmid, const void *shmaddr, int shmflg);
shmid --- return value of shmget(), that is, the id of the created shared memory.
shmat() returns the starting address of the shared memory, and thus we can read/write on the shared memory
after getting its starting address.
int shmdt(const void *shmaddr);
shmaddr --- the return value of shmat(), that is, the starting address of the shared memory.
shmdt() returns -1 on failure.
Process
Thread
• pthread_t
• pthread_mutex_t - Mutex
• pthread_cond_t - Condition variable
• pthread_key_t - Access key for thread data.
• pthread_attr_t - Thread attributes
• pthread_mutexattr_t - Mutex attributes
• pthread_condattr_t - Condition variable attributes
• pthread_once_t - One time initialization
1. pthread_t
2. int pthread_create (pthread_t *thread, const pthread_attr_t *attr, void *(*start)(void *), void
*arg); - The thread identifier which is needed to do anything with the thread is the value returned
in the first argument, *thread . The third argument is the address of the thread routine to run.
3. pthread_t pthread_self (void);
4. int pthread_detach (pthread_t thread); - Allows the system resources for the thread to be released
when the thread exits.
5. int pthread_join (pthread_t thread, void **value_ptr); - Blocks until the thread specified
terminates. It will optionally store the return value of the terminated thread.
6. void pthread_exit (void *value_ptr); - Terminates the calling thread, making value_ptr available to
a joining thread.
7. int pthread_equal (pthread_t thr1, pthread_t thr2); - Returns 0 value if the threads are not equal
and non-zero if they are the same thread.
8. pthread_t pthread_self (void); - Allows a thread to get its own identifier.
Thread States
Thread Identification
Just as a process is identified through a process ID, a thread is identified by a thread ID. Interestingly, the similarity between the two ends here.
• A process ID is unique across the system, whereas a thread ID is unique only in the context of a single process.
• A process ID is an integer value, but a thread ID is not necessarily an integer; it could well be a structure.
• A process ID can be printed very easily, while a thread ID is not easy to print.
The above points give an idea of the difference between a process ID and a thread ID.
A thread ID is represented by the type pthread_t. Since in most cases this type is a structure, there has to be a function that can compare two thread IDs.
#include <pthread.h>
int pthread_equal(pthread_t tid1, pthread_t tid2);
As you can see, the above function takes two thread IDs and returns a nonzero value if they are equal, or zero otherwise.
Another case arises when a thread wants to know its own thread ID. For this, the following function provides the desired service.
#include <pthread.h>
pthread_t pthread_self(void);
So the function pthread_self() is used by a thread to obtain its own thread ID.
Where would these two functions be required together? Suppose a linked list contains data for different threads; every node in the list holds a thread ID and the corresponding data. Whenever a thread tries to fetch its data from the linked list, it first gets its own ID by calling pthread_self() and then calls pthread_equal() on every node to see whether the node contains data for it.
An example of this generic case is one in which a master thread receives the jobs to be processed and pushes them into a linked list, and individual worker threads then walk the list and extract the jobs assigned to them.
Thread Creation
Normally when a program starts up and becomes a process, it starts with a default thread. So we
can say that every process has at least one thread of control. A process can create extra threads
using the following function :
#include <pthread.h>
int pthread_create(pthread_t *restrict tidp, const pthread_attr_t *restrict attr, void
*(*start_rtn)(void *), void *restrict arg);
The above function takes four arguments; let us discuss them briefly:
• The first argument is a pthread_t type address. Once the function is called successfully, the
variable whose address is passed as first argument will hold the thread ID of the newly created
thread.
• The second argument may specify attributes that we want the new thread to have, such as its scheduling priority; passing NULL selects the defaults.
• The third argument is a function pointer. Keep in mind that each thread starts executing in some function, and that function's address is passed here as the third argument so that the kernel knows which function to start the thread from.
• Since the start function may itself accept arguments, we can pass them in the form of a pointer to void. Why a void pointer? Because if the function needs more than one argument, this pointer can point to a structure holding all of them.
Thread Example
The following example code uses all three functions discussed above (pthread_create, pthread_self and pthread_equal).
#include<stdio.h>
#include<string.h>
#include<pthread.h>
#include<stdlib.h>
#include<unistd.h>
pthread_t tid[2];
void* doSomeThing(void *arg)
{
    unsigned long i = 0;
    pthread_t id = pthread_self();
    if(pthread_equal(id, tid[0]))
    {
        printf("\n First thread processing\n");
    }
    else
    {
        printf("\n Second thread processing\n");
    }
    for(i=0; i<(0xFFFFFFFF); i++);
    return NULL;
}
int main(void)
{
int i = 0;
int err;
while(i < 2)
{
err = pthread_create(&(tid[i]), NULL, &doSomeThing, NULL);
if (err != 0)
printf("\ncan't create thread :[%s]", strerror(err));
else
printf("\n Thread created successfully\n");
i++;
}
sleep(5);
return 0;
}
Output:
$ ./threads
Thread created successfully
First thread processing
Thread created successfully
Second thread processing
Xinu processes all execute in the same address space and do not incur the overhead of switching to the kernel address space during context switches. Such processes are called lightweight processes (lwps), since they have little associated state and share memory with each other and with the process manager, making context switches, process creation, and interprocess communication relatively inexpensive. These are to be contrasted with Unix-like heavyweight processes (hwps), which run in separate address spaces and switch to the kernel address space on context switches. Lightweight and heavyweight processes are complementary concepts in that one can run multiple lightweight processes inside a heavyweight process, for example by creating Xinu lwps within a Unix hwp.
Thread Attributes:
• By default, a thread is created with certain attributes. Some of these attributes can be changed by
the programmer via the thread attribute object.
• pthread_attr_init and pthread_attr_destroy are used to initialize/destroy the thread attribute object.
• Other routines are then used to query/set specific attributes in the thread attribute object.
Attributes include:
o Detached or joinable state
o Scheduling inheritance
o Scheduling policy
o Scheduling parameters
o Scheduling contention scope
o Stack size
o Stack address
o Stack guard (overflow) size
POSIX Thread:
• Threads use and exist within the resources of a process, yet are able to be scheduled by the operating system and run as independent entities, largely because they duplicate only the bare essential resources that enable them to exist as executable code.
• This independent flow of control is accomplished because a thread maintains its own:
o Stack pointer
o Registers
o Scheduling properties (such as policy or priority)
o Set of pending and blocked signals
o Thread specific data.
• So, in summary, in the UNIX environment a thread:
o Exists within a process and uses the process resources
o Has its own independent flow of control as long as its parent process exists and the OS
supports it
o Duplicates only the essential resources it needs to be independently schedulable
o May share the process resources with other threads that act equally independently (and
dependently)
o Dies if the parent process dies - or something similar
o Is "lightweight" because most of the overhead has already been accomplished through the
creation of its process.
• Because threads within the same process share resources:
o Changes made by one thread to shared system resources (such as closing a file) will be
seen by all other threads.
o Two pointers having the same value point to the same data.
o Reading and writing to the same memory locations is possible, and therefore requires explicit synchronization by the programmer.
• The original Pthreads API was defined in the ANSI/IEEE POSIX 1003.1 - 1995 standard. The
POSIX standard has continued to evolve and undergo revisions, including the Pthreads
specification.
• Copies of the standard can be purchased from IEEE or downloaded for free from other sites
online.
• The subroutines which comprise the Pthreads API can be informally grouped into four major
groups:
1. Thread management: Routines that work directly on threads - creating, detaching,
joining, etc. They also include functions to set/query thread attributes (joinable,
scheduling etc.)
2. Mutexes: Routines that deal with synchronization, called a "mutex", which is an
abbreviation for "mutual exclusion". Mutex functions provide for creating, destroying,
locking and unlocking mutexes. These are supplemented by mutex attribute functions that
set or modify attributes associated with mutexes.
3. Condition variables: Routines that address communications between threads that share a
mutex. Based upon programmer specified conditions. This group includes functions to
create, destroy, wait and signal based upon specified variable values. Functions to
set/query condition variable attributes are also included.
4. Synchronization: Routines that manage read/write locks and barriers.
• Naming conventions: All identifiers in the threads library begin with pthread_. Some examples
are shown below.
pthread_ Threads themselves and miscellaneous subroutines
pthread_attr_ Thread attributes objects
pthread_mutex_ Mutexes
pthread_mutexattr_ Mutex attributes objects
pthread_cond_ Condition variables
pthread_condattr_ Condition attributes objects
pthread_key_ Thread-specific data keys
• The concept of opaque objects pervades the design of the API. The basic calls work to create or
modify opaque objects - the opaque objects can be modified by calls to attribute functions, which
deal with opaque attributes.
• The Pthreads API contains around 100 subroutines. This tutorial will focus on a subset of these -
specifically, those which are most likely to be immediately useful to the beginning Pthreads
programmer.
• For portability, the pthread.h header file should be included in each source file using the
Pthreads library.
• The current POSIX standard is defined only for the C language. Fortran programmers can use
wrappers around C function calls. Some Fortran compilers (like IBM AIX Fortran) may provide a
Fortran pthreads API.
• A number of excellent books about Pthreads are available. Several of these are listed in the
References section of this tutorial.
Thread Management
Routines:
pthread_create (thread,attr,start_routine,arg)
pthread_exit (status)
pthread_cancel (thread)
pthread_attr_init (attr)
pthread_attr_destroy (attr)
Creating Threads:
• Initially, your main() program comprises a single, default thread. All other threads must be
explicitly created by the programmer.
• pthread_create creates a new thread and makes it executable. This routine can be called any
number of times from anywhere within your code.
• pthread_create arguments:
o thread: An opaque, unique identifier for the new thread returned by the subroutine.
o attr: An opaque attribute object that may be used to set thread attributes. You can specify
a thread attributes object, or NULL for the default values.
o start_routine: the C routine that the thread will execute once it is created.
o arg: a single argument that may be passed to start_routine. It must be passed by reference as a pointer cast to void *; NULL may be used if no argument is to be passed.
• The Pthreads API provides several routines that may be used to specify how threads are
scheduled for execution. For example, threads can be scheduled to run FIFO (first-in first-out),
RR (round-robin) or OTHER (operating system determines). It also provides the ability to set a
thread's scheduling priority value.
• These topics are not covered here; however, a good overview of "how things work" under Linux can be found in the sched_setscheduler man page.
• The Pthreads API does not provide routines for binding threads to specific cpus/cores. However,
local implementations may include this functionality - such as providing the non-standard
pthread_setaffinity_np routine. Note that "_np" in the name stands for "non-portable".
• Also, the local operating system may provide a way to do this. For example, Linux provides the
sched_setaffinity routine.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#define NUM_THREADS 5

void *PrintHello(void *threadid)
{
    long tid = (long)threadid;
    printf("Hello World! It's me, thread #%ld!\n", tid);
    pthread_exit(NULL);
}

int main(void)
{
    pthread_t threads[NUM_THREADS];
    int rc;
    long t;
    for(t=0; t<NUM_THREADS; t++){
        printf("In main: creating thread %ld\n", t);
        rc = pthread_create(&threads[t], NULL, PrintHello, (void *)t);
        if (rc){
            printf("ERROR; return code from pthread_create() is %d\n", rc);
            exit(-1);
        }
    }
    pthread_exit(NULL);
}
Thread Synchronization
1. No Synchronization
2. Synchronization with Monitor
3. Synchronization with Semaphore
1. No Synchronization
With no synchronization, all threads run simultaneously and can execute the same piece of code at the same time; there is no restriction on how many threads can enter it. Following is the code:
public void AccessCode()
{
listBox1.BeginInvoke(new ParameterizedThreadStart(UpdateUI), new object[]
{"Thread ID : " + Thread.CurrentThread.ManagedThreadId.ToString() + ": Entered" }
);
Thread.Sleep(500);
listBox1.BeginInvoke(new ParameterizedThreadStart(UpdateUI), new object[]
{ "Thread ID : " + Thread.CurrentThread.ManagedThreadId.ToString() + " : Exit" }
);
}
With synchronization through the monitor class, only one thread can access the resource at a time. Threads still run simultaneously, but they enter the protected block of code one at a time; the restriction is that only a single thread can be inside a particular code block.
In synchronization with the semaphore class, we can allow more than one thread to access the same block of code. In fact, we can specify exactly how many threads may be inside the block at the same time.
new object[]
{"Thread ID : " +
Thread.CurrentThread.ManagedThreadId.ToString() + " : Entered" });
Thread.Sleep(500);
}
finally
{
l_SemaPhore.Release();
listBox1.BeginInvoke(new ParameterizedThreadStart(UpdateUI),
new object[]
{ "Thread ID : " +
Thread.CurrentThread.ManagedThreadId.ToString() + " : Exit" });
IsComplete = true;
}
}
else
{
listBox1.BeginInvoke(new ParameterizedThreadStart(UpdateUI),
new object[]
{"Thread ID :"+Thread.CurrentThread.ManagedThreadId.ToString() +
": Waiting To enter"});
}
}
}
public void UpdateUI(object objOutput)
{
listBox1.Items.Add(objOutput.ToString());
}
Socket Programming
Objective: This Unit gives introduction about network programming and discusses about socket creation
in connection oriented and connection less network.
CONTENT OVERVIEW:
To a programmer a socket looks and behaves much like a low level file descriptor. This is because
commands such as read() and write() work with sockets in the same way they do with files and pipes. The
differences between sockets and normal file descriptors occur in the creation of a socket and through a
variety of special operations to control a socket.
Sockets were first introduced in 2.1BSD and subsequently refined into their current form with 4.2BSD.
The sockets feature is now available with most current UNIX system releases.
A Unix socket is used in client/server application frameworks. A server is a process that performs some function on request from a client. Most application-level protocols like FTP, SMTP and POP3 make use of sockets to establish a connection between client and server and then to exchange data.
Socket Types:
There are four types of sockets available to users. The first two are most commonly used and the last two are rarely used.
Processes are presumed to communicate only between sockets of the same type but there is no restriction
that prevents communication between sockets of different types.
• Stream Sockets: Delivery in a networked environment is guaranteed. If you send through the
stream socket three items "A,B,C", they will arrive in the same order - "A,B,C". These sockets
use TCP (Transmission Control Protocol) for data transmission. If delivery is impossible, the
sender receives an error indicator. Data records do not have any boundaries.
• Datagram Sockets: Delivery in a networked environment is not guaranteed. They're
connectionless because you don't need to have an open connection as in Stream Sockets - you
build a packet with the destination information and send it out. They use UDP (User Datagram
Protocol).
• Raw Sockets: provides users access to the underlying communication protocols which support
socket abstractions. These sockets are normally datagram oriented, though their exact
characteristics are dependent on the interface provided by the protocol. Raw sockets are not
intended for the general user; they have been provided mainly for those interested in developing
new communication protocols, or for gaining access to some of the more esoteric facilities of an
existing protocol.
• Sequenced Packet Sockets: They are similar to a stream socket, with the exception that record
boundaries are preserved. This interface is provided only as part of the Network Systems (NS)
socket abstraction, and is very important in most serious NS applications. Sequenced-packet
sockets allow the user to manipulate the Sequence Packet Protocol (SPP) or Internet Datagram
Protocol (IDP) headers on a packet or a group of packets either by writing a prototype header
along with whatever data is to be sent, or by specifying a default header to be used with all
outgoing data, and allows the user to receive the headers on incoming packets.
Most of the Net Applications use the Client Server architecture. These terms refer to the two processes or
two applications which will be communicating with each other to exchange some information. One of the
two processes acts as a client process and another process acts as a server.
Client Process:
This is the process which typically makes a request for information. After getting the response this
process may terminate or may do some other processing.
For example: Internet Browser works as a client application which sends a request to Web Server to get
one HTML web page.
Server Process:
This is the process which takes requests from clients. After getting a request, the server does the required processing, gathers the requested information, and sends it to the requesting client. Once done, it becomes ready to serve another client. Server processes are always alert and ready to serve incoming requests.
For example: Web Server keeps waiting for requests from Internet Browsers and as soon as it gets any
request from a browser, it picks up a requested HTML page and sends it back to that Browser.
Notice that the client needs to know of the existence and the address of the server, but the server does not
need to know the address or even the existence of the client prior to the connection being established.
Once a connection is established, both sides can send and receive information.
• 2-tier architectures: In this architecture, the client interacts directly with the server. This type of architecture may have some security holes and performance problems. Internet Explorer and a Web server work on a two-tier architecture. Here security problems are resolved using the Secure Sockets Layer (SSL).
• 3-tier architectures: In this architecture, an additional piece of software sits between the client and the server; this middle software is called middleware. Middleware is used to perform all the security checks and load balancing in case of heavy load. The middleware takes all requests from the client and, after doing the required authentication, passes each request to the server. The server then does the required processing and sends the response back to the middleware, and finally the middleware passes this response back to the client. If you want to implement a 3-tier architecture, you can keep any middleware like WebLogic or WebSphere between your Web server and Web browsers.
Types of Server:
• Iterative Server: This is the simplest form of server, where a server process serves one client and, only after completing that request, takes a request from another client. Meanwhile, other clients keep waiting.
• Concurrent Servers: This type of server runs multiple concurrent processes to serve many requests at a time, because one request may take longer and other clients cannot wait that long. The simplest way to write a concurrent server under Unix is to fork a child process to handle each client separately.
The system calls for establishing a connection are somewhat different for the client and the server, but
both involve the basic construct of a socket. The two processes each establish their own sockets.
The steps involved in establishing a socket on the client side are as follows:
1. Create a socket with the socket() system call.
2. Connect the socket to the address of the server using the connect() system call.
3. Send and receive data using the read() and write() system calls.
The steps involved in establishing a socket on the server side are as follows:
1. Create a socket with the socket() system call.
2. Bind the socket to an address using the bind() system call. For a server socket on the Internet, an address consists of a port number on the host machine.
3. Listen for connections with the listen() system call.
4. Accept a connection with the accept() system call; this call typically blocks until a client connects.
5. Send and receive data using the read() and write() system calls.
This tutorial describes the core socket functions required to write a complete TCP client and server.
To perform network I/O, the first thing a process must do is call the socket function, specifying the type
of communication protocol desired and protocol family etc.
#include <sys/types.h>
#include <sys/socket.h>
int socket(int family, int type, int protocol);
This call gives you a socket descriptor that you can use in later system calls, or -1 on error.
Parameters:
family: specifies the protocol family and is one of the constants shown below:
Family Description
AF_INET IPv4 protocols
AF_INET6 IPv6 protocols
AF_LOCAL Unix domain protocols
AF_ROUTE Routing sockets
AF_KEY Key socket
This tutorial talks only about IPv4 (AF_INET); the other families are not covered.
type: specifies kind of socket you want. It can take one of the following values:
Type Description
SOCK_STREAM Stream socket
SOCK_DGRAM Datagram socket
SOCK_SEQPACKET Sequenced packet socket
SOCK_RAW Raw socket
protocol: argument should be set to the specific protocol type given below or 0 to select the system's
default for the given combination of family and type:
Protocol Description
IPPROTO_TCP TCP transport protocol
IPPROTO_UDP UDP transport protocol
IPPROTO_SCTP SCTP transport protocol
The connect function is used by a TCP client to establish a connection with a TCP server.
#include <sys/types.h>
#include <sys/socket.h>
int connect(int sockfd, struct sockaddr *serv_addr, int addrlen);
This call returns 0 if it successfully connects to the server, otherwise it returns -1 on error.
Parameters:
• sockfd is a socket descriptor returned by the socket function.
• serv_addr is a pointer to a struct sockaddr that contains the destination IP address and port.
• addrlen should be set to sizeof(struct sockaddr).
The bind function assigns a local protocol address to a socket. With the Internet protocols, the protocol
address is the combination of either a 32-bit IPv4 address or a 128-bit IPv6 address, along with a 16-bit
TCP or UDP port number. This function is called by TCP server only.
#include <sys/types.h>
#include <sys/socket.h>
int bind(int sockfd, struct sockaddr *my_addr, int addrlen);
This call returns 0 if it successfully binds to the address, otherwise it returns -1 on error.
Parameters:
• sockfd is a socket descriptor returned by the socket function.
• my_addr is a pointer to a struct sockaddr that contains the local IP address and port.
• addrlen should be set to sizeof(struct sockaddr).
A 0 value for port number means system will choose a random port and INADDR_ANY value for IP
address means server's IP address will be assigned automatically.
server.sin_port = 0;
server.sin_addr.s_addr = INADDR_ANY;
NOTE: As described in the Ports and Services tutorial, all ports below 1024 are reserved. So you should set a port above 1024 and below 65535, except for the ones already being used by other programs.
The listen function is called only by a TCP server and it performs two actions:
• The listen function converts an unconnected socket into a passive socket, indicating that the
kernel should accept incoming connection requests directed to this socket.
• The second argument to this function specifies the maximum number of connections the kernel
should queue for this socket.
#include <sys/types.h>
#include <sys/socket.h>
int listen(int sockfd, int backlog);
This call returns 0 on success, otherwise it returns -1 on error.
Parameters:
• sockfd is a socket descriptor returned by the socket function.
• backlog is the number of pending connections that may be queued.
The accept function is called by a TCP server to return the next completed connection from the front of
the completed connection queue. If the completed connection queue is empty, the process is put to sleep.
#include <sys/types.h>
#include <sys/socket.h>
int accept(int sockfd, struct sockaddr *cliaddr, socklen_t *addrlen);
This call returns a non-negative descriptor on success, otherwise it returns -1 on error. The returned descriptor is a client socket descriptor, and all read and write operations to communicate with the client are done on this descriptor.
Parameters:
• sockfd is a socket descriptor returned by the socket function.
• cliaddr is a pointer to a struct sockaddr that will be filled with the client's address.
• addrlen should be set to sizeof(struct sockaddr).
The send function is used to send data over stream sockets or CONNECTED datagram sockets. If you
want to send data over UNCONNECTED datagram sockets you must use sendto() function.
You can use write() system call to send the data. This call is explained in helper functions tutorial.
int send(int sockfd, const void *msg, int len, int flags);
This call returns the number of bytes sent out otherwise it will return -1 on error.
Parameters:
• sockfd is a socket descriptor returned by the socket function.
• msg is a pointer to the data you want to send.
• len is the length of the data in bytes.
• flags is set to 0 for ordinary use.
The recv function is used to receive data over stream sockets or CONNECTED datagram sockets. If you
want to receive data over UNCONNECTED datagram sockets you must use recvfrom().
You can use read() system call to read the data. This call is explained in helper functions tutorial.
int recv(int sockfd, void *buf, int len, unsigned int flags);
This call returns the number of bytes read into the buffer otherwise it will return -1 on error.
Parameters:
• sockfd is a socket descriptor returned by the socket function.
• buf is the buffer to read the information into.
• len is the maximum length of the buffer.
• flags is set to 0 for ordinary use.
The sendto function is used to send data over UNCONNECTED datagram sockets. Put simply, use it when the socket type is SOCK_DGRAM.
int sendto(int sockfd, const void *msg, int len, unsigned int flags, const struct sockaddr *to, socklen_t tolen);
This call returns the number of bytes sent, otherwise it returns -1 on error.
Parameters:
• The first four parameters are as for send; to points to a struct sockaddr holding the destination address, and tolen is its size.
The recvfrom function is used to receive data from UNCONNECTED datagram sockets. Put simply, use it when the socket type is SOCK_DGRAM.
int recvfrom(int sockfd, void *buf, int len, unsigned int flags, struct sockaddr *from, socklen_t *fromlen);
This call returns the number of bytes read into the buffer, otherwise it returns -1 on error.
Parameters:
• The first four parameters are as for recv; from points to a struct sockaddr that will be filled with the sender's address, and fromlen points to its size.
The close function is used to close the communication between client and server.
int close(int sockfd);
Parameters:
• sockfd is a socket descriptor returned by the socket function.
The shutdown function is used to gracefully close the communication between client and server. This function gives more control in comparison to the close function.
int shutdown(int sockfd, int how);
Parameters:
• sockfd is a socket descriptor returned by the socket function.
• how indicates which half of the connection to shut down: 0 disallows further receives, 1 disallows further sends, and 2 disallows both.
The select function indicates which of the specified file descriptors is ready for reading, ready for writing,
or has an error condition pending.
When an application calls recv or recvfrom it is blocked until data arrives for that socket. An application
could be doing other useful processing while the incoming data stream is empty. Another situation is
when an application receives data from multiple sockets.
Calling recv or recvfrom on a socket that has no data in its input queue prevents immediate reception of
data from other sockets. The select function call solves this problem by allowing the program to poll all
the socket handles to see if they are available for non-blocking reading and writing operations.
Parameters:
• nfds: specifies the range of file descriptors to be tested. The select() function tests file descriptors
in the range of 0 to nfds-1
• readfds:points to an object of type fd_set that on input specifies the file descriptors to be checked
for being ready to read, and on output indicates which file descriptors are ready to read. Can be
NULL to indicate an empty set.
• writefds:points to an object of type fd_set that on input specifies the file descriptors to be checked
for being ready to write, and on output indicates which file descriptors are ready to write. Can be
NULL to indicate an empty set.
• exceptfds :points to an object of type fd_set that on input specifies the file descriptors to be
checked for error conditions pending, and on output indicates which file descriptors have error
conditions pending. Can be NULL to indicate an empty set.
• timeout: points to a timeval struct that specifies how long the select call should poll the descriptors
for an available I/O operation. If the timeout value is 0, then select will return immediately. If the
timeout argument is NULL, then select will block until at least one file/socket handle is ready for
an available I/O operation. Otherwise select will return after the amount of time in the timeout
has elapsed OR when at least one file/socket descriptor is ready for an I/O operation.
The return value from select is the number of handles specified in the file descriptor sets that are ready for
I/O. If the time limit specified by the timeout field is reached, select returns 0. The following macros exist
for manipulating a file descriptor set:
• FD_CLR(fd, &fdset): Clears the bit for the file descriptor fd in the file descriptor set fdset
• FD_ISSET(fd, &fdset): Returns a non-zero value if the bit for the file descriptor fd is set in the
file descriptor set pointed to by fdset, and 0 otherwise.
• FD_SET(fd, &fdset): Sets the bit for the file descriptor fd in the file descriptor set fdset.
• FD_ZERO(&fdset): Initializes the file descriptor set fdset to have zero bits for all file descriptors.
The behavior of these macros is undefined if the fd argument is less than 0 or greater than or equal to
FD_SETSIZE.
Example:
fd_set fds;
struct timeval tv;
tv.tv_sec = 1;
tv.tv_usec = 500000;
FD_ZERO(&fds);
FD_SET(sock, &fds);
select(sock + 1, &fds, NULL, NULL, &tv);
if (FD_ISSET(sock, &fds))
/* do something */
else
/* do something else */
There are various structures which are used in Unix Socket Programming to hold information about the
address and port and other information. Most socket functions require a pointer to a socket address
structure as an argument. Structures defined in this tutorial are related to Internet Protocol Family.
struct sockaddr {
unsigned short sa_family;
char sa_data[14];
};
This is a generic socket address structure which will be passed in most of the socket function calls. Here
is the description of the member fields:
sa_family AF_INET, AF_UNIX, AF_NS, AF_IMPLINK - This represents an address family. In most Internet-based applications we use AF_INET.
sa_data Protocol-specific address - The content of the 14 bytes of protocol-specific address is interpreted according to the type of address. For the Internet family we use a port number and IP address, represented by the sockaddr_in structure defined below.
Second structure that helps you to reference to the socket's elements is as follows:
struct sockaddr_in {
short int sin_family;
unsigned short int sin_port;
struct in_addr sin_addr;
unsigned char sin_zero[8];
};
sin_family AF_INET, AF_UNIX, AF_NS, AF_IMPLINK - This represents an address family. In most Internet-based applications we use AF_INET.
sin_port Service port - The 16-bit port number, in network byte order.
sin_addr IP address - The 32-bit IP address, held in a struct in_addr.
sin_zero Not used - You just set this field to all zeros, as it is not used.
The next structure is used only as a field of the above structure and holds the 32-bit netid/hostid.
struct in_addr {
unsigned long s_addr;
};
There is one more important structure. This structure is used to keep information related to host.
struct hostent {
char *h_name;
char **h_aliases;
int h_addrtype;
int h_length;
char **h_addr_list;
};
h_name ti.com etc. - The official name of the host, for example tutorialspoint.com or google.com.
h_aliases - A list of alternative names for the host.
h_addrtype AF_INET - The address family; for Internet-based applications it will always be AF_INET.
h_length 4 - The length of the IP address, which is 4 for an Internet (IPv4) address.
h_addr_list in_addr - For Internet addresses, the array of pointers h_addr_list[0], h_addr_list[1] and so on point to struct in_addr.
Following structure is used to keep information related to service and associated ports.
struct servent {
char *s_name;
char **s_aliases;
int s_port;
char *s_proto;
};
s_name http - The official name of the service, for example SMTP, FTP or POP3.
s_aliases ALIAS - A list of service aliases; most of the time this will be NULL.
s_port 80 - The associated port number; for HTTP this will be 80.
s_proto TCP - The protocol used; Internet services are provided using either TCP or UDP.
Additional Topics
1)Pipes:
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main()
{
    FILE *write_fp;
    char buffer[BUFSIZ + 1];
    sprintf(buffer, "Once upon a time, there was ...\n");
    write_fp = popen("od -c", "w");
    if (write_fp != NULL) {
        fwrite(buffer, sizeof(char), strlen(buffer), write_fp);
        pclose(write_fp);
        exit(EXIT_SUCCESS);
    }
    exit(EXIT_FAILURE);
}
printf("Process %d opening FIFO O_WRONLY\n", getpid());
pipe_fd = open(FIFO_NAME, open_mode);
printf("Process %d result %d\n", getpid(), pipe_fd);
}