PERL On Unix/Linux: Practical Extraction and Reporting Language



Contents
Introduction
Scalar Variables and Lists
Arrays and Hashes
Operators and Precedence
Conditional Statements and Loops
Regular Expressions
Subroutines
File and Directory Handling

History
Developed by Larry Wall in 1987.
Derives from the ubiquitous C programming language and, to a lesser extent, from sed, awk, and the Unix shell.
Perl was originally designed under Unix, but now also runs on most major operating systems (including Windows).

Introduction
What is PERL?
An interpreted language that is optimized for string manipulation, I/O, and system tasks.

Why PERL?
Speed of development: there is no need to compile, create an object file, and then execute.
Power and flexibility of a high-level programming language.
Easy to use, freely available, and portable.
Makes easy jobs easy, without making hard jobs impossible.

Beginning with Perl


perldoc perl lists the manual pages that come with every Perl installation.
perldoc -h gives a brief summary of the available options.
perl -v gives the version of Perl in use.
To create a Perl program, only a text editor and the perl interpreter are required.
Perl file names conventionally end with .pl (for example, simple.pl).

Beginning with Perl (Contd..)


Execution command: perl filename.pl, or ./filename.pl
When Unix executes a Perl script directly, it first looks for the #! (shebang) line, runs the program named on the remainder of that line, and passes the name of the script to it as an argument.
#!/usr/bin/perl or #!/usr/local/bin/perl names the Perl interpreter.
To run a script as ./filename.pl, add one of these as its first line and make the file executable (for example, with chmod +x filename.pl).

Beginning with Perl (Contd..)


The core of Perl is the Perl interpreter, the engine that compiles and runs Perl scripts. All Perl programs go through two phases:
a compile phase, where the syntax is checked and the source code, including any modules used, is converted into an internal bytecode (an operation tree);
a run-time phase, where the interpreter executes that bytecode.

Man Pages
The man command is used to read the documentation.
Command     Description
perl        Overview (top level)
perldelta   Changes since last version
perlfaq     Frequently asked questions
perltoc     Table of contents for Perl documentation
perlsyn     Perl syntax
perlop      Operators and precedence
perlre      Perl regular expressions
perlfunc    Built-in functions
perlsub     Subroutines
perlvar     Predefined variables

Basic Syntax
# starts a comment that runs to the end of the line.
All statements end with a semicolon (;).
$_ is a special variable called the default variable.
Perl is case sensitive.
A Perl program is compiled and run in a single operation.

Simple Programs
Example
#!/usr/local/bin/perl
# The line above directs the system to the perl interpreter.
print "Welcome to perl";    # Prints a message on the output.
This program displays:
Welcome to perl

Example:
#!/usr/local/bin/perl -c
print "welcome to perl";
This program displays (prefixed by the script name):
syntax OK

Basic Options
Option   Description
-c       Checks syntax and exits without executing the script.
-v       Prints the version of the perl executable.
-w       Prints warnings.
-e       Used to enter and execute a line of script on the command line.

Standard Files
File     Description
STDIN    The normal input channel for the script.
STDOUT   The normal output channel.
STDERR   The normal output channel for errors.

Standard Files (Contd..)


Example
#!/usr/local/bin/perl -w
print "Enter the Text\n";
$input = <STDIN>;                # Reads a line of input and stores it in $input
chomp($input);                   # Removes the trailing newline character
print "entered text =$input";    # Prints the input on the command line
This program displays:
Enter the Text
Perl is awesome
entered text =Perl is awesome

Perl reads the typed line as "Perl is awesome\n": the newline produced by the Enter key is part of the input, so use chomp to remove it.

Variables
Variables are names that refer to data held as values. Perl defines three basic data types: scalars, arrays, and hashes.

Scalars :
A scalar holds a single value, which may be a string, a number, or a reference.
Scalar names begin with $, followed by a letter or underscore and then by letters, digits, or underscores.
Example:
$var = 1;                # integer
$var = "Hello_world";    # string
$var = 2.65;             # decimal number
$3var = 123;             # Error: a name should not start with a number

Variable Interpolation
Interpolation takes place only inside double quotation marks.
Example
#!/usr/local/bin/perl -w
$x = 12;                      # Assigns the value to the variable
print 'Value of x is $x';     # Single quotes do not interpolate (no processing is done)
This program displays:
Value of x is $x

Example
#!/usr/local/bin/perl -w
$x = 12;                      # Assigns the value to the variable
print "Value of x is $x";     # Double quotes interpolate (the variable is replaced by its content)
This program displays:
Value of x is 12

Integers
Integers are usually written in decimal (base 10) but can be specified in several different formats.
Literal   Format
234       decimal integer
0765      octal integer
0b1101    binary integer
0xcae     hexadecimal integer

A number can be converted from one base to another using the sprintf function, and displayed in a different base using the printf function.

Integers (Contd..)
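Example
#!/usr/local/bin/perl -w
$bin = 0b1010;
$hex = sprintf "%x", $bin;
$oct = sprintf "%o", 45;
print "binary =$bin \n hexa =$hex \n octal =$oct";
This program displays:
binary =10
 hexa =a
 octal =55
$bin holds the number ten, so print shows it in decimal; use sprintf "%b", $bin to display it as 1010.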

Integers (Contd..)
Example
#!/usr/local/bin/perl -w
$x = 98;
printf("Value in decimal =%d\n", $x);
printf("Value in octal =%o\n", $x);
printf("Value in binary =%b\n", $x);
printf("Value in hexadecimal =%x\n", $x);
This program displays:
Value in decimal =98
Value in octal =142
Value in binary =1100010
Value in hexadecimal =62
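The conversions above can be collected into one small sketch. The helper name to_bases is hypothetical and only used for illustration; sprintf formats a number in another base, while the built-in oct and hex functions convert a string in another base back to a number.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: render one integer in four bases using sprintf.
sub to_bases {
    my ($n) = @_;
    return (
        dec => sprintf("%d", $n),
        oct => sprintf("%o", $n),
        bin => sprintf("%b", $n),
        hex => sprintf("%x", $n),
    );
}

my %bases = to_bases(98);
print "$_ = $bases{$_}\n" for sort keys %bases;

# Going the other way: oct() understands the 0b and 0 prefixes, hex() reads hex digits.
my @back = (oct("0b1100010"), oct("0142"), hex("62"));
print "@back\n";
```

All three strings in @back describe the same number, so the last line prints the same value three times.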

Escaped Sequences
Character strings that are enclosed in double quotes accept escape sequences for special characters. An escape sequence consists of a backslash (\) followed by one or more characters.
Escape Sequence   Description
\b                Backspace
\e                Escape
\f                Form feed
\l                Forces the next letter into lower case
\L                All following letters are lower case
\u                Forces the next letter into upper case
\U                All following letters are upper case
\r                Carriage return
\v                Vertical tab in some languages, but not a recognized escape in Perl strings (use \x0B instead)

Built-in functions
Function   Description
chomp()    Removes (usually) any newline character from the end of a string. We say "usually" because it actually removes whatever matches the current value of $/ (the input record separator), and $/ defaults to a newline.
           Ex: chomp($text);
chop()     Removes the last character of a string (or group of strings) regardless of what that character is.
           Ex: chop($text);
chr()      Returns the character represented by the given number in the character set.
           Ex: chr(65) gives A.
ord()      Returns the numeric (ASCII) value of the first character of the expression.
           Ex: ord("A") gives 65.
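As a brief sketch of how these four functions behave, including their return values (the variable names are illustrative only):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $chomped = "Perl is awesome\n";
my $removed = chomp($chomped);   # chomp returns the number of characters removed

my $chopped = "Perl";
my $last    = chop($chopped);    # chop returns the character that was removed

print "chomp removed $removed character(s): '$chomped'\n";
print "chop removed '$last', leaving '$chopped'\n";
print "chr(65) = ", chr(65), ", ord('A') = ", ord("A"), "\n";
```

Calling chomp a second time on the same string removes nothing, which makes it safer than chop for cleaning up input lines.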

Lists
A list is a group of scalars, used to initialize an array or a hash.
The elements of a list can be numbers, strings, or any other type of scalar data.
Each element of a list can be accessed by a numerical index.
The elements of a list are enclosed in a pair of parentheses and separated by commas.

Lists (Contd..)
Example:
$var = "welcome";                              # normal variable
@list1 = (12, 24, "kacper", "$var", 36.48);    # first list
# The first list contains 5 elements, two of which are strings (kacper, welcome).
@list2 = (12, 24, 'kacper', '$var', 36.48);    # second list
# The second list contains 5 elements, two of which are strings (kacper, $var).
# Assigning a list to a scalar instead of an array would keep only its last element.

Lists (Contd..)
A flexible way of defining a list is the qw (quote word) operator, which avoids typing many quotation marks. Be careful with white space: qw splits on it, so an element cannot contain a space.
Example
#!/usr/local/bin/perl -w
print ("sachin", "dravid", "ganguly", "kumble", "\n");
print qw(sachin dravid ganguly kumble);
print "\n";
print ("sachin", "dravid", "ganguly", "anil kumble", "\n");
print ("k", "a", qw(c p e r), "t", "e", "c", "h");
print "\n";
print (("sachin", "dravid", "ganguly", "kumble")[1]);
This program displays:
sachindravidgangulykumble
sachindravidgangulykumble
sachindravidgangulyanil kumble
kacpertech
dravid

Lists (Contd..)
The difference between a list and an array (or hash) is that an array is a variable that can be initialized with a list.
Range Operator
Written as .. and used to create a list from a range of letters or numbers.
Example:
print (2 .. 4, "**", "a" .. "d");
This program displays:
234**abcd

Lists (Contd..)
List functions
A list is joined into a string using the join function.
Example:
print join(" ", ("perl", "is", "a", "scripting", "language"));
This program displays:
perl is a scripting language

A string is split into a list using the split function.
Example:
print join("|", split(//, "perl"));
This program displays:
p|e|r|l

Lists (Contd..)
List functions (Cont..)
map evaluates an expression or block for each element of a list.
Example:
print join(", ", (map lc, "A", "B", "C"));
This program displays:
a, b, c

grep returns the sublist of a list for which a specific criterion is true.
Example:
print grep(!/x/, "a", "b", "x", "d");
This program displays:
abd
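The two functions are often chained. The following sketch, using a made-up word list, filters with grep and then transforms the survivors with map:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @words = qw(Perl awk Sed shell);

# grep keeps the elements for which the block is true.
my @short = grep { length($_) <= 4 } @words;

# map transforms every element; here each word is lower-cased.
my @lower = map { lc($_) } @short;

print join(", ", @lower), "\n";
```

Both functions set $_ to each element in turn, so the blocks can refer to the current element without naming it.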

Arrays
A one-dimensional ordered list of scalar variables.
An array provides dynamic storage for a list, and so can be shrunk, grown, and manipulated by altering values.
Represented using the @ (at) symbol.
An array without a name is called a list.
Elements of an array are accessed using the index number (the first element has index zero, the next has one, and so on).
Each element in an array is a scalar.
$#array holds the last index value in the array.

Arrays (Contd..)
( ) represents the empty list.
Example:
@arr = ("perl", 2, 5.143);
print "@arr";    # Displays all the elements
This program displays:
perl 2 5.143

Example:
@num = (1, 2, 3, 4, 5);
print "\@num has ", $#num + 1, " elements";    # Last index (4) + 1 is the length of the array
This program displays:
@num has 5 elements

Arrays (Contd..)
Example
#!/usr/local/bin/perl -w
@a = ("a" .. "z");
$len1 = @a;            # Assigning the array to a scalar gives its length
$len2 = scalar(@a);    # Using the scalar function
print "length of a =$len1 \n";
print "length of a =$len2";
This program displays:
length of a =26
length of a =26

Example
#!/usr/local/bin/perl -w
@arr = ("one", 2, "three", 4.4);
$arr[2] = "kacper";    # The third element (three, at index 2) is replaced by kacper
print "@arr";
This program displays:
one 2 kacper 4.4

Array Methods
push
The push function adds a value or values to the end of an array.
Example:
@num = (1, 2, 3, 4, 5);
push(@num, 6);    # Pushes 6 onto the end of the array
print @num;       # Displays all the elements of the array
This program displays:
123456

pop
The pop function removes the last element of an array and returns it.
Example:
@num = (1, 2, 3, 4, 5);
pop(@num);        # Removes the last element of the array
print @num;       # Displays all the elements of the array
This program displays:
1234

Array Methods (Contd..)


unshift
The unshift function adds a value or values at the start of an array.
Example:
@num = (1, 2, 3, 4, 5);
unshift(@num, 6);    # Adds 6 at the beginning of the array
print @num;          # Displays all the elements of the array
This program displays:
612345

Array Methods (Contd..)


shift
The shift function removes the first value of the array and returns it.
Example:
@num = (1, 2, 3, 4, 5);
$x = shift(@num);    # Shifts the first element off the array; @num now holds 2, 3, 4, 5
print $x;            # Displays the value stored in $x
This program displays:
1

Array Methods (Contd..)


map
An array-processing function that converts one list into another.
Syntax: map EXPRESSION, LIST or map BLOCK LIST
Runs the expression on each element of the list (like a loop).
Locally sets $_ as an alias to the current item.
Example:
@small = qw(one two three);
@caps = map(uc, @small);    # uc returns an upper-case version
print("@caps");             # Displays in upper case
This program displays:
ONE TWO THREE

Array Methods (Contd..)


map (Contd..)
Example:
@num = (65, 66, 67, 68);
@num2 = map(2*$_, @num);      # Multiplies each element by 2
@char = map(chr $_, @num);    # chr returns the character represented by that number
print "@num2 \n @char\n";
This program displays:
130 132 134 136
 A B C D

Array Methods (Contd..)


Array Slice
An array slice is a section of an array.
Example:
@num = (1, 2, 3, 4, 5);
@val = @num[0, 1];         # Array slice of the first two elements of @num
print join(", ", @val);    # Displays the first two elements of @num
This program displays:
1, 2

Array Methods (Contd..)


Array Splice
splice removes elements from an array and/or inserts the elements of a list at a given position.
Syntax: splice(ARRAY, OFFSET, LENGTH, LIST)
Example:
@num = (1, 2, 3, 4, 5);
@val = (6, 7);
splice(@num, 5, 0, @val);    # Inserts the elements of @val at offset 5 (the end), removing nothing
print join(", ", @num);      # Displays all the elements of @num after splicing
This program displays:
1, 2, 3, 4, 5, 6, 7
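Because splice can insert, remove, and replace, a short sketch of each use may help. The offsets and values are arbitrary illustrations:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @num = (1, 2, 3, 4, 5);

# Insert without removing: offset 2, length 0.
splice(@num, 2, 0, 'a', 'b');        # @num is now 1 2 a b 3 4 5

# Remove two elements starting at offset 2; splice returns what it removed.
my @removed = splice(@num, 2, 2);    # @num is back to 1 2 3 4 5

# Replace one element (offset 0, length 1) with two new ones.
splice(@num, 0, 1, 10, 11);          # @num is now 10 11 2 3 4 5

print "removed: @removed\n";
print "result:  @num\n";
```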

Array Methods (Contd..)


sort
Sorts the elements in string (ASCII) order by default.
Inside a comparison block, sort sets the global variables $a and $b to the two elements being compared; using these, we can specify our own sort order.
Example:
@str = qw(sachin dravid ganguly kumble);
@val = (56, 13, 45, 11);
@str_sort1 = sort(@str);
@val_sort1 = sort(@val);
print "@str_sort1 \n";
print "@val_sort1";
This program displays:
dravid ganguly kumble sachin
11 13 45 56

Array Methods (Contd..)


sort (Contd..)
Example:
#!/usr/local/bin/perl
@str = qw(sachin dravid ganguly kumble);
@val = (56, 13, 45, 11);
@str_sort2 = sort { $a cmp $b } @str;    # sorted in alphabetical order
@str_rev = sort { $b cmp $a } @str;      # sorted in reverse order
@val_sort2 = sort { $a <=> $b } @val;    # sorted in ascending order
@val_rev = sort { $b <=> $a } @val;      # sorted in descending order
print "@str_sort2 \t @str_rev\n";
print "@val_sort2 \t @val_rev\n";
This program displays:
dravid ganguly kumble sachin     sachin kumble ganguly dravid
11 13 45 56      56 45 13 11
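The choice between cmp and <=> matters as soon as the numbers have different lengths. A minimal sketch with made-up values:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @val = (100, 9, 25, 3);

my @as_strings = sort @val;                  # default: compares as strings
my @as_numbers = sort { $a <=> $b } @val;    # numeric, ascending
my @descending = sort { $b <=> $a } @val;    # numeric, descending

print "string order:  @as_strings\n";
print "numeric order: @as_numbers\n";
print "descending:    @descending\n";
```

In string order "100" sorts before "25" because the comparison proceeds character by character, which is rarely what is wanted for numbers.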

Array Methods (Contd..)


join
The join function concatenates the elements of an array or a list into a string, using a separator given by a scalar value.
Syntax: $string = join(EXPR, LIST);
Example:
#!/usr/local/bin/perl
@arr = ("mukesh", "anil", "prem", "ratan");
$arr = join "\t", @arr;
print "business Tycoons: $arr\n";
print join "-CEO\t", @arr, "\n";
This program displays:
business Tycoons: mukesh  anil  prem  ratan
mukesh-CEO  anil-CEO  prem-CEO  ratan-CEO

Array Methods (Contd..)


Array Reversal
The reverse function is used to reverse the elements of an array.
Example:
@num = (1, 2, 3, 4, 5);
@rev = reverse @num;       # Reverses the elements of @num
print join(", ", @rev);    # Displays all the elements of @rev
This program displays:
5, 4, 3, 2, 1

split
The split() function is the opposite of the join function.
Syntax: LIST = split(/PATTERN/, EXPR, LIMIT)

Array Methods (Contd..)


LIST is the list or array that receives the fields returned by the split function.
PATTERN is usually a regular expression, but can be a single character or a string.
EXPR is the string expression that will be split into an array or a list.
LIMIT is the maximum number of fields EXPR will be split into.
Example:
#!/usr/local/bin/perl
$string = "gandhi-ind-nehru-ind-sastri-ind-kalam-ind";
@colors = split('ind', $string);
print @colors, "\n";
This program displays:
gandhi--nehru--sastri--kalam-
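The effect of LIMIT is easiest to see side by side. In this sketch the colon-separated record is made-up sample data:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $record = "root:x:0:0:root:/root:/bin/bash";

my @all = split(/:/, $record);       # no LIMIT: every field
my @two = split(/:/, $record, 2);    # LIMIT 2: first field, then the rest

print scalar(@all), " fields: @all\n";
print scalar(@two), " fields: $two[0] | $two[1]\n";

# join with the same separator restores the original string.
print join(":", @all) eq $record ? "round trip ok\n" : "mismatch\n";
```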

Hashes
An associative array, ideal for handling attribute/value pairs.
Lists and arrays are ordered and accessed by index; hashes are unordered and accessed by a specified key.
Represented using the % symbol.
The first element in each pair is called the key, and the second element is the value associated with that key.
Example:
%coins = ("quarter", 25, "dime", 10);
or
%coins = (quarter => 25, dime => 10);    # key => value

Hashes (Contd..)
Hash values can be any scalar, just like array elements, but hash keys can only be strings.
Example: Printing the hash.
#!/usr/local/bin/perl
%hash1 = (one => 1, two => 2, three => 3, four => 4);
print %hash1;    # print "%hash1" would not work: hashes are not interpolated in strings
print "\n@{[%hash1]}\n";
@temp = %hash1;
print "@temp";
This program displays (the order may differ from run to run):
three3one1two2four4
three 3 one 1 two 2 four 4
three 3 one 1 two 2 four 4

The print order is determined by how Perl chooses to store the hash internally.

Hashes (Contd..)
A hash can have only scalars as values. { } are used to access individual elements of the hash.
Example:
#!/usr/local/bin/perl
%hash1 = (one => 1, two => 2, three => 3, four => 4);
$ele = $hash1{three};                # single key: use the scalar sigil $
@mul_ele = @hash1{"four", "one"};    # multiple keys: use the array sigil @ (a hash slice)
print "single element =$ele\n";
print "multiple elements =@mul_ele";
This program displays:
single element =3
multiple elements =4 1

Hashes (Contd..)
The keys function returns the list of keys in a hash; in scalar context it gives the number of keys.
The values function returns the list of values in a hash; in scalar context it gives the number of values.
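A minimal sketch of keys and values on the same %hash1 used earlier; the results are sorted here only so that the output is predictable, since a hash has no inherent order:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %hash1 = (one => 1, two => 2, three => 3, four => 4);

my @k     = sort keys %hash1;                   # list of keys
my @v     = sort { $a <=> $b } values %hash1;   # list of values
my $count = keys %hash1;                        # scalar context: number of keys

print "keys:   @k\n";
print "values: @v\n";
print "count:  $count\n";
```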

Manipulating Hashes
To add a key or change its value: $hash1{three} = "PERL";
This overwrites the previous value if the key already exists; otherwise the key is added.
The undef function removes the value of a key, but the key still exists.
Example: undef $hash1{two};
The delete function removes both the key and its value from the hash.
Example: delete $hash1{four};
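The difference between undef and delete can be checked with the built-in exists and defined functions, as in this short sketch:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %hash1 = (one => 1, two => 2, three => 3, four => 4);

$hash1{three} = "PERL";    # overwrite an existing value
undef $hash1{two};         # the key stays, its value becomes undefined
delete $hash1{four};       # the key and its value are both removed

for my $key (qw(two three four)) {
    my $exists  = exists $hash1{$key}  ? "yes" : "no";
    my $defined = defined $hash1{$key} ? "yes" : "no";
    print "$key: exists=$exists defined=$defined\n";
}
print "keys left: ", scalar(keys %hash1), "\n";
```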

Hash Sorting
Hashes are not ordered, and we must not rely on the order in which we added the items; Perl stores them internally in its own way.
We can sort a hash either by key or by value, using the sort function.
Example: Sort by key
%data = (sachin => 10, dravid => 19, dhoni => 7, rohit => 45);
foreach $key (sort(keys(%data))) {
    print "\t$key \t $data{$key}";
}
This program displays:
dhoni  7  dravid  19  rohit  45  sachin  10

Hash Sorting (contd..)


The sort function returns the keys ordered by the comparison block, so the first iteration of the loop sees the key with the least (or greatest) value.
Example: Sort by value
%data = (sachin => 10, dravid => 19, dhoni => 7, rohit => 45);
foreach $key (sort { $data{$a} <=> $data{$b} } keys %data) {
    print "\t $key \t\t $data{$key} \n";
}
This program displays:
dhoni   7
sachin  10
dravid  19
rohit   45

In the above example the keys of %data are compared by their values (using sort { $data{$a} <=> $data{$b} }), and the sorted keys are assigned to $key one per iteration, from the least value to the greatest.

Operators
Operators can be broadly divided into 4 types.
Unary operators take one operand. Example: the not operator, !
Binary operators take two operands. Example: the addition operator, +
Ternary operators take three operands. Example: the conditional operator, ?:
List operators take a list of operands. Example: the print operator

Arithmetic Operators
Operator   Description
+          Adds two numbers
-          Subtracts two numbers
*          Multiplies two numbers
/          Divides two numbers
++         Increments by one (as in C)
--         Decrements by one
%          Gives the remainder (10 % 3 gives 1)
**         Raises a number to a power. print 2**5;  # prints 32

Shift Operators
Shift operators manipulate integer values as binary numbers, shifting their bits to the left or to the right by the given number of positions.
Operator   Description
<<         Left shift. print 2 << 3;    # left shift by three positions, prints 16
>>         Right shift. print 42 >> 2;  # right shift by two positions, prints 10
x          Repetition operator.
           Ex: print "hi" x 3;              # Output: hihihi
           Ex2: @array = (1, 2, 3) x 3;     # array contains (1,2,3,1,2,3,1,2,3)
           Ex3: @arr = (2) x 80;            # 80-element array of value 2

Logical Operators
Logical operators are represented by either symbols or names. The two sets are identical in operation, but have different precedence.
Operator     Description
&& or and    Returns true if both operands are true
|| or or     Returns true if either operand is true
xor          Returns true if only one operand is true
! or not     (Unary) Returns true if the operand is false

The ! operator has a much higher precedence than && and ||. The not, and, or, and xor operators have the lowest precedence of all Perl's operators, with not being the highest of the four.
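The precedence difference is visible in how the same operands are grouped. This is a small sketch with arbitrary values:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ! binds tighter than +, while not binds looser than +.
my $sym  = ! 0 + 1;        # parsed as (!0) + 1
my $word = (not 0 + 1);    # parsed as not (0 + 1), which is false

my ($p, $q) = (0, 9);
my $r1 = $p || $q;         # || binds tighter than =, so $r1 gets ($p || $q)
my $r2 = ($p or $q);       # or binds looser than =, so parentheses are needed here

print "sym=$sym word='$word' r1=$r1 r2=$r2\n";
```

Without the parentheses, my $r2 = $p or $q would assign $p to $r2 first and only then evaluate or, which is why the word forms are usually reserved for flow control such as open(...) or die.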

Bitwise Operators
Bitwise operators treat their operands as binary values and perform a logical operation between the corresponding bits of each value.
Operator   Description
&          Bitwise AND
|          Bitwise OR
^          Bitwise XOR
~          Bitwise NOT

Comparison Operators
The comparison operators are binary, returning a value based on a comparison of their two operands.
Operator   Description
<          Less than
>          Greater than
==         Equality
<=         Less than or equal
>=         Greater than or equal
<=>        Does not return a Boolean value. Returns -1 if left is less than right, 0 if left is equal to right, 1 if left is greater than right
!=         Inequality

Comparison Operators on strings


Operator   Description
eq         Returns true if the operands are equal
lt         Returns true if the left operand is less than the right
ge         Returns true if the left operand is greater than or equal to the right
le         Returns true if the left operand is less than or equal to the right
gt         Returns true if the left operand is greater than the right
cmp        Does not return a Boolean value. Returns -1 if left is less than right, 0 if left is equal to right, 1 if left is greater than right
ne         Returns true if the operands are not equal
. (dot)    Concatenation operator. It takes two strings and joins them.
           Ex: print "System" . "Verilog";  # prints SystemVerilog

Binding operator
The binding operator, =~, binds a scalar expression to a pattern match.
String operations like s///, m//, and tr/// work on $_ by default. With the binding operator they can work on a scalar variable other than $_.
The value returned from =~ is the return value of the bound operation; a failed match returns false (the empty string).
The !~ operator performs a logical negation of the returned value for conditional expressions, that is, 1 for failure and '' for success in both scalar and list contexts.
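A short sketch of the binding operators applied to a named variable rather than $_. The copy-then-modify idiom (my $copy = $text) =~ ... is a common way to leave the original string untouched:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "winners see gain";

my $has_gain = ($text =~ /gain/) ? "yes" : "no";    # match bound to $text
my $no_pain  = ($text !~ /pain/) ? "yes" : "no";    # negated match

(my $shouted = $text) =~ s/gain/GAIN/;              # substitution on a copy
(my $consonants = $text) =~ tr/aeiou//d;            # transliteration on a copy

print "has gain: $has_gain, no pain: $no_pain\n";
print "$shouted\n$consonants\n";
```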

Conditional Statements
if Statement
The if keyword executes a statement block based on the evaluation of an expression, or chooses between several statement blocks based on the evaluation of expressions.
Example:
$var = 2;
if ($var == 1) {
    print "we are in first if \n";
}
elsif ($var == 2) {
    print "we are in second if \n";
}
else {
    print "we are in third if\n";
}
This program displays:
we are in second if

Conditional Statements (Contd..)


until Loops
until loops repeat a block of statements while some condition is false. The condition is tested before the block runs, so the block may not run at all.
Example: until loop
$firstVar = 10;
until ($firstVar < 20) {
    print("inside: firstVar = $firstVar\n");
    $firstVar++;
}
print("outside: firstVar = $firstVar\n");
This program displays:
outside: firstVar = 10

Conditional Statements (Contd..)


do-until Loops
A do-until loop tests the condition after the block, so the block always runs at least once.
Example: do-until loop
$firstVar = 10;
do {
    print("inside: firstVar = $firstVar\n");
    $firstVar++;
} until ($firstVar < 20);
print("outside: firstVar = $firstVar\n");
This program displays:
inside: firstVar = 10
outside: firstVar = 11

Conditional Statements (Contd..)


for Loops
Example: for loop
for ($firstVar = 0; $firstVar < 100; $firstVar++) {
    print("inside: firstVar = $firstVar\n");
}
This program displays:
inside: firstVar = 0
inside: firstVar = 1
...
inside: firstVar = 98
inside: firstVar = 99

Conditional Statements (Contd..)


foreach Loops
The foreach statement iterates over the elements of a list or an array. It is very handy for finding the largest element, printing the elements, or simply seeing if a given value is a member of an array. Inside the loop, $_ is an alias for the current element, so assigning to it changes the array.
Example: foreach loop
@array = (1..5, 5..10);
print("@array\n");
foreach (@array) {
    $_ = "**" if ($_ == 5);
}
print("@array\n");
This program displays:
1 2 3 4 5 5 6 7 8 9 10
1 2 3 4 ** ** 6 7 8 9 10

Jump Keywords
The last Keyword
The last keyword is used to exit from a loop block.
Example: last
@array = ("A".."Z");
for ($index = 0; $index < @array; $index++) {
    if ($array[$index] eq "T") {
        last;
    }
}
print("$index\n");
This program displays:
19

Jump Keywords (Cont..)


The next Keyword
The next keyword is used to skip the rest of the statement block and start the next iteration.
Example: next keyword
@array = (0..9);
print("@array\n");
for ($index = 0; $index < @array; $index++) {
    if ($index == 3 || $index == 5) {
        next;
    }
    $array[$index] = "*";
}
print("@array\n");
This program displays:
0 1 2 3 4 5 6 7 8 9
* * * 3 * 5 * * * *

Jump Keywords (Cont..)


The redo Keyword
The redo keyword causes Perl to restart the current statement block without re-evaluating any loop condition.
Example: redo
{
    print("What is your name? ");
    $name = <STDIN>;
    chop($name);
    if (! length($name)) {
        print("Msg: Zero length input. Please try again\n");
        redo;
    }
    print("Thank you, " . uc($name) . "\n");
}

Regular Expressions
A regular expression (regexp) is simply a string that describes a pattern. (A familiar example of a pattern is finding the files in a directory whose names end with .sv, i.e. ls *.sv.)
Used for finding and extracting patterns within text.
The role of the regexp engine is to take a search pattern and apply it to the supplied text.
The following operators use regular expressions:
Matching Operator (m//)
Substitution Operator (s///)
Transliteration (Translation) Operator (tr///)

The Matching Operator (m//)


The matching operator (m//) is used to find patterns in strings.
Example:
#!/usr/local/bin/perl
$_ = "success is a progressive journey";
$var = "success is not a destination";
if (/success/) {                 # the initial m is optional
    print "String success Found\n";
}
if ($var =~ /destination/) {
    print "String destination Found";
}
This program displays:
String success Found
String destination Found

The Matching Operator (Contd..)


When a regular expression is enclosed in slashes (/success/) with no binding operator, $_ is tested against the regular expression, returning true if there is a match and false otherwise.

Finding multiple matches


Example:
#!/usr/local/bin/perl
$txt = "winn-ers see ga-in, lose-rs see pa-in";
while ($txt =~ m/-/g) {
    print "Found another -\n";
}
This program displays:
Found another -
Found another -
Found another -
Found another -
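As a complement to the while loop, a global match in list context returns every match at once. The sketch below also uses the built-in pos function to report where each match was found; the variable names are illustrative:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $txt = "winn-ers see ga-in, lose-rs see pa-in";

# List context: //g returns all matches in one go.
my @hyphens = $txt =~ /-/g;

# Counting idiom: assign the matches to an empty list, then to a scalar.
my $count = () = $txt =~ /-/g;

# Scalar context: each iteration continues from the previous match.
my @positions;
while ($txt =~ /-/g) {
    push @positions, pos($txt) - 1;    # pos() is the offset just after the match
}

print "found ", scalar(@hyphens), " hyphens (count=$count)\n";
print "first hyphen at offset $positions[0]\n";
```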

The Substitution Operator (s///)


The substitution operator (s///) is used to change strings.
Syntax: LVALUE =~ s/PATTERN/REPLACEMENT/
The return value of an s/// operation (in scalar and list contexts alike) is the number of times it succeeded (which can be more than once if used with the /g modifier).
On failure, since it substituted zero times, it returns false (""), which is numerically equivalent to 0.
If PATTERN is a null string, the last successfully executed regular expression is used instead.

The Substitution Operator (Contd..)


Example:
#!/usr/local/bin/perl
$text = "winners see gain, losers see pain";
$text =~ s/winners/WINNERS/;
print $text;
This program displays:
WINNERS see gain, losers see pain

Example:
#!/usr/local/bin/perl
@arr = qw(sachin dravid ganguly sachin);
foreach (@arr) {    # for (@arr) { s/sachin/10/ } and s/sachin/10/ for @arr do the same thing
    s/sachin/10/;
}
print "\n@arr";
This program displays:
10 dravid ganguly 10

The Substitution Operator (Contd..)


Example:
#!/usr/local/bin/perl
@old = qw(sachin-bharat dravid-bharat ganguly-bharat kumble-ind);
for (@new = @old) { s/bharat/india/ }    # copies @old into @new, then edits only the copy
print "@old\n";
print "@new\n";
This program displays:
sachin-bharat dravid-bharat ganguly-bharat kumble-ind
sachin-india dravid-india ganguly-india kumble-ind

Using Modifiers with m// and s///


Modifier             Description
g (m//g or s///g)    Works globally to perform all possible operations.
i                    Ignores alphabetic case.
x                    Ignores white space in the pattern and allows comments.
gc                   Doesn't reset the search position after a failed match.
s                    Lets the . character match newlines.
m                    Lets ^ and $ match next to embedded \n characters.
e                    Evaluates the right-hand side as an expression.
o                    Compiles the pattern only once.
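A few of these modifiers in action, as a sketch with made-up strings. The i, g, e, and x modifiers are shown; the others follow the same pattern:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "Winners see gain, losers see pain";

# i: ignore case when matching.
my $found = ($text =~ /winners/i) ? 1 : 0;

# g: replace every occurrence; s/// returns how many it replaced.
(my $loud = $text) =~ s/see/SEE/g;
my $times = (my $scratch = $text) =~ s/see/SEE/g;

# e: treat the replacement as Perl code to evaluate.
(my $doubled = "3 apples and 4 pears") =~ s/(\d+)/$1 * 2/ge;

# x: white space and comments inside the pattern are ignored.
my $is_date = "2024-01-15" =~ /
    ^ \d{4}    # year
    - \d{2}    # month
    - \d{2}    # day
    $
/x ? 1 : 0;

print "$found $times $is_date\n$loud\n$doubled\n";
```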

The Translation Operator (tr///)


Syntax: LVALUE =~ tr/SEARCHLIST/REPLACEMENTLIST/
It scans a string, character by character, and replaces each occurrence of a character found in SEARCHLIST (which is not a regular expression) with the corresponding character from REPLACEMENTLIST.
It returns the number of characters replaced or deleted.
If no string is specified via the =~ or !~ operator, the $_ string is altered.

The Translation Operator (Contd..)


Modifier      Description
c (tr///c)    Complements the search list.
d             Deletes found but unreplaced characters.
s             Squeezes runs of duplicate replaced characters into one.

Example:
#!/usr/local/bin/perl
$text = "winners see gain,losers see pain";
$count = ($text =~ tr/e/E/);
print $text;
print "\n no..of replacements =$count";
This program displays:
winnErs sEE gain,losErs sEE pain
 no..of replacements =6

The Translation Operator (Contd..)


Example:
#!/usr/local/bin/perl
$text = "winners see gain,losers see pain";
$count = ($text =~ tr/e/E/c);               # every character except e is replaced
print $text;
print "\n no..of replacements =$count";
This program displays:
EEEEeEEEEeeEEEEEEEEEeEEEEeeEEEEE
 no..of replacements =26
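Some further uses of tr///, sketched on the same sentence: counting without changing anything, using ranges, and the d and s modifiers.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "winners see gain, losers see pain";

# An empty replacement list leaves the string unchanged, so this only counts.
my $e_count = ($text =~ tr/e//);

# Ranges map character for character: lower case to upper case.
(my $upper = $text) =~ tr/a-z/A-Z/;

# d deletes characters that have no replacement.
(my $no_vowels = $text) =~ tr/aeiou//d;

# s squeezes each run of the same translated character into one.
(my $squeezed = "aaabbbccc") =~ tr/a-c//s;

print "$e_count\n$upper\n$no_vowels\n$squeezed\n";
```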

Different Pattern Delimiters


If the pattern contains lots of slash characters (/), we can use a different delimiter around the pattern.
Example:
#!/usr/local/bin/perl
$var = "winners / see / gain,losers / see pain";
if ($var =~ m|see|) {       # match with pipes
    print "String see Found\n";
}
if ($var =~ m?gain?) {      # match with question marks
    print "String gain Found";
}
This program displays:
String see Found
String gain Found

Perl also allows paired characters such as brackets and braces, viz. { }, ( ), < >, [ ].
Ex: $var =~ s{gain}{GAIN};

The Parts of regular Expressions


In general, a regular expression can be made up of the following parts:
Characters
Character Classes
Alternative Match Patterns
Quantifiers
Assertions

Characters
In a regular expression any single character matches itself, unless it is a metacharacter with a special meaning.
Besides normal characters, Perl defines special characters that you can use in regular expressions. These characters must start with a backslash (otherwise Perl treats them as normal characters).
Character    Description
. (period)   Matches any single character except the newline character.
             Ex: $var1 =~ /r.n/    # will match run, ran, ron

Characters (Contd..)
character Description

\d    Equivalent to [0-9]. Matches any digit.
      Ex: $var =~ /\d/    # will match any digit
\D    Equivalent to [^0-9]. Matches any non-digit.
      Ex: $var =~ /\D/    # will match any non-digit

Characters (Contd..)
character Description

\w    Equivalent to [0-9a-zA-Z_]. Matches a word character allowable in a Perl variable name, i.e. any alphanumeric character: the upper and lower case letters, the digits 0..9, and the underscore character _.
      Ex: if ($var =~ /\w/)
\W    Equivalent to [^0-9a-zA-Z_]. Matches any non-word character. Inverse of \w.
      Ex: if ($var =~ /\W/)

Characters (Contd..)
character Description

\s    Equivalent to [ \t\n\r]. Matches any white space character, i.e. a space, a tab, a newline, a return.
      Ex: if ($var =~ /\s/)
\S    Equivalent to [^ \t\n\r]. Matches any non-white space character.
      Ex: if ($var =~ /\S/)

Characters (Contd..)
character Description

\Q    Quotes (disables) pattern metacharacters until \E is found.
      Ex:
      #!/usr/local/bin/perl
      $var = "success is not a *";
      # if ($var =~ /*/) would not compile: a bare * is a quantifier with nothing to repeat.
      if ($var =~ /\Q*\E/) {
          print "found a literal *";
      }
      It will display: found a literal *
\E    Ends case modification (\U, \L) or quoting (\Q).

Characters (Contd..)
character Description

\U    Changes the following characters to upper case until a \E sequence is encountered.
      Ex:
      $var = "SUCCESS is not a *";
      if ($var =~ /success/) {
          print "found in 1st if";
      }
      if ($var =~ /\Usuccess\E/) {
          print "found in 2nd if";
      }
      It will display: found in 2nd if
\L    Changes the following characters to lower case until a \E sequence is encountered. Works like \U.

Characters (Contd..)
character Description

\u    Changes the next character to upper case.
      Ex:
      #!/usr/local/bin/perl
      $var = "SUCCESS is not a *";
      if ($var =~ /\us/) {       # pattern becomes S
          print "found only s";
      }
      if ($var =~ /\usu/) {      # pattern becomes Su, which does not occur in $var
          print "found su";
      }
      It will display: found only s
\l    Changes the next character to lower case.

Character Classes
A character class allows a set of possible characters, rather than just a single character, to match at a particular point in a regular expression.
Character classes are denoted by brackets [...], with the set of characters to be possibly matched inside.
A class matches one occurrence of any character inside the brackets.
Ex 1: $var =~ /w[aoi]nder/    # will match wander, wonder, winder

Character Classes (Contd..)


If you use ^ as the first character in a character class, then that character class matches any character not in the class. (Outside a character class [ ], ^ works as an anchor.)
Ex 1: $var =~ /w[^aoi]nder/    # will look for w followed by something that is
                               # none of a, o, or i, followed by nder

Alternative Match Patterns


An alternative match pattern lets you specify a series of alternatives for a pattern, using | to separate them. | (called alternation) is equivalent to an "or" in a regular expression: it gives the pattern a choice.
Ex: $var2 =~ /hope|trust/   # will match either hope or trust

Alternatives are checked from left to right, so the first alternative that matches is the one that's used.

Grouping Alternatives
Grouping with ( ) allows parts of a regular expression to be treated as a single unit. Parts of a regular expression are grouped by enclosing them in parentheses. Grouping is used to factor out the common characters of similar terms and specify only the differences.
Example: $var2 =~ /(while|for) loop/   # will match either while loop or for loop

Grouping Alternatives (Contd..)


The pairs of parentheses are numbered from left to right by the positions of the left parentheses. Perl places the text that is matched by the regular expression in the first pair of parentheses into the variable $1, the text matched by the regular expression in the second pair of parentheses into $2, and so on.
Example:
    #!/usr/local/bin/perl
    my $text = "Testing";
    if ($text =~ /((T|N)est(ing|er))/) {
        print " \$1 = $1 \t \$2 = $2 \t \$3 = $3 \n ";
    }
This program displays: $1 = Testing $2 = T $3 = ing

Grouping Alternatives (Contd..)


There are three pairs of parentheses in the above example. The first one is that which surrounds the whole regular expression, hence $1 evaluates to the whole matched text, which is Testing. The match caused by the second pair of parentheses (T|N), which is T, is assigned to $2. The third pair of parentheses (ing|er) causes $3 to be assigned the value ing.
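The same numbering rule can be checked on another pattern. This sketch uses made-up input strings; it also shows that a match in list context returns the captured groups directly, so $1, $2 and $3 need not be used at all.

```perl
use strict;
use warnings;

my $entry = "for loop";                  # hypothetical input

# Group 1 is the outer pair, groups 2 and 3 are the inner alternations.
if ($entry =~ /((while|for) (loop|statement))/) {
    print "whole: $1, keyword: $2, noun: $3\n";
}

# In list context the match returns the captures as a list.
my @parts = "while statement" =~ /((while|for) (loop|statement))/;
print join(" | ", @parts), "\n";
```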

Quantifiers
Quantifiers say how many times something may match, instead of the default of matching just once. You can use a quantifier to specify that a pattern must match a specific number of times. Quantifiers in a regular expression are like loops in a program.

Quantifiers (Contd..)
character Description

*

It indicates that the item immediately to the left should be matched zero or more times.
Ex: $var =~ /st*/   # will match strings like st, sttr, sts, star, son
The regexp a* matches zero or more a characters, so on its own it succeeds on every string, even one that contains no a at all.

+

It indicates that the item immediately to the left should be matched one or more times.
Ex: $var =~ /st+/   # will match strings like st, sttr, sts, star, but not son

Quantifiers (Contd..)
character Description

{}

It indicates how many times the item immediately to the left should be matched.
{n} should match exactly n times.
{n,} should match at least n times.
{n,m} should match at least n times but not more than m times.
Ex: $var =~ /mn{2,4}p/   # will match mnnp, mnnnp, mnnnnp

?

It indicates that the item immediately to the left should be matched zero or one time.
Ex: $var =~ /st?r/     # will match either sr or str
    $var =~ /comm?a/   # will match either coma or comma
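To compare the quantifiers, the sketch below runs each one over the same small set of strings. The test strings are assumptions chosen for illustration, and the patterns are anchored with ^ and $ so the whole string has to fit.

```perl
use strict;
use warnings;

# Hypothetical test strings: m, then 0, 1, 2, 4 or 5 n characters, then p.
my @tests = qw(mp mnp mnnp mnnnnp mnnnnnp);

my @star  = grep { /^mn*p$/ }     @tests;   # zero or more n
my @plus  = grep { /^mn+p$/ }     @tests;   # one or more n
my @opt   = grep { /^mn?p$/ }     @tests;   # zero or one n
my @range = grep { /^mn{2,4}p$/ } @tests;   # two to four n

print "*     : @star\n";
print "+     : @plus\n";
print "?     : @opt\n";
print "{2,4} : @range\n";
```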

Quantifiers (Contd..)
Quantifiers are greedy by default, which means they will try to match as much as they can.
Ex:
    $str = "they are players, aren't they ?";
    $str =~ s/.*are/were/;
    print $str;
It will print: weren't they ?
Perl uses the .* preceding are to match all the characters up to the last are in $str.

Making Quantifiers Less Greedy


To make quantifiers less greedy, that is, to match the minimum number of times possible, follow the quantifier with a ?.

*?       Matches zero or more times.
+?       Matches one or more times.
??       Matches zero or one time.
{n}?     Matches exactly n times.
{n,}?    Matches at least n times.
{n,m}?   Matches at least n times but not more than m times.

In every case the match uses as few repetitions as possible.

Making Quantifiers Less Greedy (Contd.)


Example:
    #!/usr/local/bin/perl
    $text = "They are players, aren't they?";
    $text =~ s/.*?are/were/;
    print $text;
This program displays: were players, aren't they?

Example:
    #!/usr/local/bin/perl
    $txt = "no, these are the documents, over there.";
    $txt =~ s/the(.*?)e/those/;
    print $txt;
This program displays: no, those are the documents, over there.
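Greedy and non-greedy matching are easiest to compare on the same input. A minimal sketch, using the sentence from the slides and applying both substitutions to separate copies:

```perl
use strict;
use warnings;

my $text = "They are players, aren't they?";

# Copy the string, then substitute on the copy.
(my $greedy = $text) =~ s/.*are/were/;    # .* runs to the LAST "are"
(my $lazy   = $text) =~ s/.*?are/were/;   # .*? stops at the FIRST "are"

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```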

Assertions
Assertions (also called anchors) are used to match conditions within a string, not actual data. Assertions are zero width because they do not consume any characters when they match.

Anchor Description

^ (caret)

Appears at the beginning of the pattern and looks for a match at the beginning of the line.
Ex: $var =~ /^su/   # will match strings that start with su, e.g. sun, success, super

Assertions (Contd..)
character Description

$ (dollar)

Appears at the end of the pattern and looks for a match at the end of the line.
Ex: $var =~ /at$/   # will match strings that end with at, e.g. cat, rat, beat

\Z

Matches only at the end of a string, or before a newline at the end. That is, it matches at the end of the text, before the final newline if one is present. Otherwise it is the same as \z.

\z

Matches only at the very end of the string.

Assertions (Contd..)
Character Description

\A

Matches only at the beginning of a string. (Similar to ^)

\G

It applies when we use a regular expression to produce multiple matches using the g pattern modifier. It re-anchors the regular expression at the end of the previous match, so that previously matched text takes no part in further matches. Works only with /g.

The difference between ^ and \A is that when you use the m multiline-modifier, ^ matches the beginning of every line, but \A retains its original meaning and matches only at the very beginning of the whole string.
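The difference shows up as soon as the string holds more than one line. In this sketch the three-line sample string is an assumption for illustration.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\nthird line";   # sample data

# Under /m, ^ matches at the start of every line.
my @caret = $text =~ /^(\w+)/mg;

# \A still matches only at the very start of the string.
my @big_a = $text =~ /\A(\w+)/mg;

print "with ^ : @caret\n";
print "with \\A: @big_a\n";
```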

Assertions (Contd..)
Word Boundaries
\b matches on a word boundary. This occurs whenever a word character is adjacent to a non-word character, or to the start or end of the string. It matches the position between \w\W or \W\w without consuming either character. Within character classes \b represents backspace rather than a word boundary, just as it normally does in any double-quoted string.

Assertions (Contd..)
Word Boundaries
\B matches on a non-word boundary. This occurs whenever two word characters or two non-word characters fall adjacent to each other. It matches the position between \w\w or \W\W without consuming either character.

Assertions (Contd..)
Word Boundaries
Example:
    #!/usr/local/bin/perl
    $text = "one, ****, three, four";
    $text1 = "one,****,three, four";
    foreach ($text =~ /\b\w+\b/g) {
        print $_, "\t";
    }
    print "\n using \\B\n";
    foreach ($text1 =~ /\B\w+\B/g) {
        print $_, "\t";
    }
This program displays:
    one three four
     using \B
    n hre ou
* is not a word character, so **** is never matched.
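A common use of \b is to find a whole word rather than the same letters buried in a longer word. The sketch below counts matches with the `= () =` idiom, which forces list context and then counts the results; the sample sentence is made up.

```perl
use strict;
use warnings;

my $text = "the other theme is there, the end";     # sample sentence

my $any_the   = () = $text =~ /the/g;        # the letters t-h-e anywhere
my $whole_the = () = $text =~ /\bthe\b/g;    # only the standalone word
my $inner_the = () = $text =~ /\Bthe\B/g;    # only strictly inside a word

print "anywhere: $any_the, whole word: $whole_the, inside a word: $inner_the\n";
```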

Regular Expressions-Examples(1)
character Description

$var =~ m/^\./

Will match a dot (.) at the beginning of the line. ^ is used to match at the beginning of the line; dot is a metacharacter, so it has to be preceded by \.

$var =~ /\w+/

Will match a word: a nonempty sequence of alphanumeric characters and underscores, such as trust, 12bar8 and kac_per.

$var =~ /start\s*end/

Will match the strings start and end, optionally separated by any amount of white space (spaces, tabs, newlines).

Regular Expressions-Examples(2)
character Description

$var =~ /o\.m/

It will match exactly o.m

$var =~ /blue(colo(ur|r))/

Will match either bluecolor or bluecolour.

$var =~ s/\s+$//

Removes (trims) the trailing white space.

Regular Expressions-Examples(3)
character Description

$var =~ s/^\s+//

Removes (trims) the leading white space.
Ex:
    $txt = "   trust in god";
    $txt =~ s/^\s+//;
    print $txt;
It will print: trust in god

$var =~ m/(\d+)/

Will match the complete first number.
Ex:
    $txt = "day = 86400 s or 1440 m or 24 h";
    if ($txt =~ m/(\d+)/) {
        print "\n\nFirst Number is $1";
    }
It will print: First Number is 86400
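The two trimming substitutions are often combined into one small helper. This is a sketch only: trim is a hypothetical name chosen here, not a Perl built-in, and the input strings are sample data.

```perl
use strict;
use warnings;

# trim() is a hypothetical helper, not part of core Perl.
sub trim {
    my ($s) = @_;
    $s =~ s/^\s+//;    # drop leading white space
    $s =~ s/\s+$//;    # drop trailing white space
    return $s;
}

my $clean = trim("   trust in god \t\n");
print "[$clean]\n";

# A match in list context returns the captured group directly.
my ($first_number) = "day = 86400 s or 1440 m or 24 h" =~ /(\d+)/;
print "First number is $first_number\n";
```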

Subroutines
A subroutine is a separate body of code designed to perform a particular task. It is the same as a function in the C language. The idea behind subroutines is the old programming dictum "divide and conquer". Subroutines allow you to divide your code into manageable parts, which makes overall programming easier to handle. Perl allows you to create subroutines using the sub control structure.

Subroutines (Contd..)
Example:
    #!/usr/local/bin/perl
    $v1 = 36;
    large_small();    # subroutine call before definition, parentheses are required
    sub large_small {
        if ($v1 > 40) {
            print "value is bigger than 40\n";
        }
        else {
            print "value is smaller than 40\n";
        }
    }
    $v1 = 45;
    large_small;      # subroutine call after definition, parentheses are optional

This program displays:
    value is smaller than 40
    value is bigger than 40

Scope of variable
Perl variables have global package scope by default. When we change a variable's value in a subroutine, we change it in the rest of the program as well, possibly by mistake. We can create variables that are private to a subroutine by using the keyword my (or local, which gives a global variable a temporary value for the duration of the call). They can have the same name as a global variable and not affect that global variable at all.

Scope of variable (Contd..)


Example:
    #!/usr/local/bin/perl
    $v1 = 36;
    incr();
    print "value of v1 =$v1\n";
    sub incr {
        my $var = $v1;
        print "value before incrementing =$var\n";
        $var++;
        print "value after incrementing =$var\n";
    }
This program displays:
    value before incrementing =36
    value after incrementing =37
    value of v1 =36
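To see the contrast, the sketch below puts a subroutine that works on a private my variable next to one that changes the global directly. The subroutine names are hypothetical, and `our` is used only so that the global can be declared under `use strict`.

```perl
use strict;
use warnings;

our $v1 = 36;                        # package (global) variable

# Works on a private variable that happens to share the global's name.
sub incr_private {
    my $v1 = shift;                  # this $v1 exists only inside the sub
    $v1++;
    return $v1;
}

# Changes the global itself.
sub incr_global {
    $v1++;
    return $v1;
}

my $r1            = incr_private($v1);
my $after_private = $v1;             # global untouched
my $r2            = incr_global();
my $after_global  = $v1;             # global changed

print "private: returned $r1, global is $after_private\n";
print "global:  returned $r2, global is $after_global\n";
```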

Parameters and Arguments


You can pass values to a subroutine by placing them in parentheses after its name (ex: incr($v1);). When you pass values to a subroutine, those values are stored in a special array named @_. Besides accepting passed values, subroutines can also return values using the return keyword.

Parameters and Arguments (Contd..)


Example:
    #!/usr/local/bin/perl
    $sum = add(10, 20);
    print "sum =$sum\n";
    sub add {
        ($val1, $val2) = @_;
        return $val1 + $val2;
    }
This program displays: sum =30

Perl returns the value of the last expression evaluated in a subroutine, so you can omit the return keyword. In the above example, writing just $val1+$val2 as the last statement gives the same result.

Parameters and Arguments (Contd..)


Different ways of reading arguments passed to a subroutine:
    sub add {
        $val1 = $_[0];
        $val2 = $_[1];
    }
    sub add {
        $val1 = shift @_;    # or simply: shift;
        $val2 = shift @_;    # or simply: shift;
    }
In a subroutine, shift uses @_ by default, so you can use shift directly.
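The three styles of reading @_ produce the same result, as this sketch shows. The subroutine names are hypothetical; sum_all is added to show that @_ can hold any number of arguments.

```perl
use strict;
use warnings;

sub add_list  { my ($x, $y) = @_;            return $x + $y }   # list assignment
sub add_index { return $_[0] + $_[1] }                          # direct indexing
sub add_shift { my $x = shift; my $y = shift; return $x + $y }  # shift defaults to @_

# @_ is an ordinary array, so a subroutine can loop over all of it.
sub sum_all {
    my $total = 0;
    $total += $_ for @_;
    return $total;
}

print add_list(10, 20), " ", add_index(10, 20), " ", add_shift(10, 20), "\n";
print sum_all(1 .. 5), "\n";
```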

Recursion
Recursion happens when a subroutine calls itself, either directly or indirectly via another subroutine.
Example:
    #!/usr/local/bin/perl
    $fact = fact(6);
    print "factorial of given number =$fact\n";
    sub fact {
        my $val = shift(@_);
        if ($val == 0) { return 1; }
        elsif ($val == 1) { return $val; }
        else { return $val * fact($val - 1); }
    }
This program displays: factorial of given number =720

Subroutines Examples(1)
Passing Lists
Example:
    #!/usr/local/bin/perl
    @small = qw(sachin dravid ganguly);
    @big = case_convert(@small);
    print "@big";
    sub case_convert {
        @low = @_;
        @caps = map { uc $_ } @low;    # last expression, so @caps is returned
    }
This program displays: SACHIN DRAVID GANGULY

Subroutines Examples(2)
Nested subroutines
Example:
    #!/usr/local/bin/perl
    call();
    sub call {
        display();
        sub display {
            print "you are in inner subroutine\n";
        }
        print "you are in outer subroutine\n";
    }
    print "you are in main\n";
This program displays:
    you are in inner subroutine
    you are in outer subroutine
    you are in main
A named subroutine declared inside another subroutine is still visible throughout the package.

Pass by Reference
In general, passing arrays or hashes flattens their elements into one long list, so it's a problem if you want to send two or more distinct arrays or hashes. To preserve integrity, you can pass references to arrays or hashes instead. References can be created by using the backslash (\) operator on a variable. It is similar to the address-of (&) operator in C.

\$var     Scalar Reference
\@arr     Array Reference
\%hash    Hash Reference

Pass by Reference (Contd..)


If you pass \$a (a reference to the $a scalar variable) to a subroutine, then in the subroutine the variable that receives that parameter holds a reference (or a "pointer") pointing to the $a scalar.

Dereferencing references (using the prefixes $, @, % and the -> operator):

$$ref_var      Scalar Dereference
@$ref_arr      Array Dereference (one element: $$ref_arr[0] or $ref_arr->[0])
%$ref_hash     Hash Dereference (one value: $$ref_hash{key} or $ref_hash->{key})
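The dereferencing forms can be tried out on one reference of each kind. A minimal sketch with made-up sample data:

```perl
use strict;
use warnings;

my $year  = 1987;                    # sample data for the sketch
my @langs = qw(sed awk perl);
my %born  = (perl => 1987);

my ($ref_var, $ref_arr, $ref_hash) = (\$year, \@langs, \%born);

print "scalar:  $$ref_var\n";                               # the scalar itself
print "array:   @$ref_arr\n";                               # the whole array
print "element: $$ref_arr[2] or $ref_arr->[2]\n";           # one element, two spellings
print "value:   $$ref_hash{perl} or $ref_hash->{perl}\n";   # one hash value
print "keys:    ", join(",", keys %$ref_hash), "\n";        # the whole hash

# Changes made through a reference affect the original variable.
$$ref_var++;
push @$ref_arr, "python";
print "year is now $year, langs are now @langs\n";
```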

Pass by Reference (Contd..)


Example:
    #!/usr/local/bin/perl
    @arr = qw(America England france);
    print "before: arr = " . join(', ', @arr) . "\n";
    change(\@arr);
    print "after: arr = " . join(', ', @arr) . "\n";
    sub change {
        my $ref_arr = shift;
        $$ref_arr[0] = "China";
        ${$ref_arr}[1] = "India";    # { } encloses the reference being dereferenced.
        $ref_arr->[2] = "Japan";     # -> is called the arrow operator.
    }
This program displays:
    before: arr = America, England, france
    after: arr = China, India, Japan

Returning by Reference
If you return two arrays normally, their values are flattened into one long list. If you return references to arrays, you can dereference those references and reach the original arrays.
Example:
    #!/usr/local/bin/perl
    sub get_strings {
        @str1 = qw(Asia Australia Africa);
        @str2 = qw(Brown White Black);
        return \@str1, \@str2;
    }
    ($ref_str1, $ref_str2) = get_strings;
    print "@$ref_str1 \n";
    print "@$ref_str2 \n";
This program displays:
    Asia Australia Africa
    Brown White Black
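Passing and returning by reference can be combined in one subroutine. In this sketch split_even_odd is a hypothetical helper that takes one array reference and hands back two, so that neither list is flattened into the other.

```perl
use strict;
use warnings;

# split_even_odd() is a hypothetical helper for this sketch.
sub split_even_odd {
    my ($numbers) = @_;              # one array reference in
    my (@even, @odd);
    for my $n (@$numbers) {
        if ($n % 2 == 0) { push @even, $n }
        else             { push @odd,  $n }
    }
    return (\@even, \@odd);          # two array references out
}

my @input = (1 .. 6);
my ($even, $odd) = split_even_odd(\@input);

print "even: @$even\n";
print "odd:  @$odd\n";
```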

A lot of material on references comes with Perl itself (see perldoc perlref and perldoc perlreftut).

File Handling
A filehandle is nothing more than a nickname for a file you intend to use in your Perl scripts and programs. Filehandles are a connection between our program and an external data source. Filehandles in Perl are a distinct data type.
STDIN or standard input represents the default input filehandle and is usually connected to the keyboard.
STDOUT or standard output represents the default output filehandle and is usually connected to the console device (screen).

File Handling (Contd..)


STDERR or standard error is the default filehandle for error output and is usually connected to the screen.

Opening a file
To open a file, use the open function.
Syntax:
    open FILEHANDLE, MODE, LIST
    open FILEHANDLE, EXPR
    open FILEHANDLE
The open function takes a filename and creates the handle for it.

File Handling (Contd..)


Opening a file (Contd..)
The open function returns a true (nonzero) value if successful; otherwise it returns the undefined value. The filehandle is created in either case, but if the call to open fails, the filehandle will be unopened and unassigned. If the open fails, the reason is stored in the special variable $!, which produces a message in string context. File handling is very error prone, so use open and die together.
Ex: open (HANDLE, $filename) or die "Can't open $filename: $!\n";
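The failure path can be observed safely by trying to open a file that is known not to exist. This sketch uses the three-argument form of open with a lexical filehandle and the core File::Temp module to get an empty scratch directory; the file name is hypothetical and is never created. Instead of die, the error text is captured so the script can carry on.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir     = tempdir(CLEANUP => 1);         # empty scratch directory
my $missing = "$dir/no_such_file.txt";       # hypothetical name, never created

# open returns false on failure and leaves the reason in $!.
my $opened = open(my $fh, '<', $missing);
my $error  = $opened ? '' : "$!";            # copy $! before anything resets it

if ($opened) {
    print "opened $missing\n";
    close($fh);
}
else {
    print "Can't open $missing: $error\n";
}
```

The exact wording of the message in $! depends on the operating system.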

File Handling (Contd..)


Opening a file (Contd..)
open understands six modes in total.

MODE Symbol Description

Read

<

Open the file handle for read access only.
Ex: open (FILHND, "<$file");
This is the default mode, so the < prefix is usually optional.

Write

>

Open the file for write access only.
Ex: open (FILHND, ">$file");
If the file doesn't exist then it is created and opened. If the file does exist then its existing contents are overwritten.

File Handling (Contd..)


Opening a file (Contd..)
MODE Symbol Description

Append

>>

Open the file for write access only.
Ex: open (FILHND, ">>$file");
If the file doesn't exist then it is created and opened. If the file does exist then new data is appended to it.

Read-update

+<

Open the file for read and write access.
Ex: open (FILHND, "+<$file");
If the file does not exist then the open fails. If the file does exist then its contents are preserved for reading, and writes overwrite the existing contents from the current position.

File Handling (Contd..)


Opening a file (Contd..)
MODE Symbol Description

Write-update

+>

Open the file for read and write access.
Ex: open (FILHND, "+>$file");
If the file doesn't exist then it is created. If the file does exist then it is truncated and its existing contents are lost. (Usually used for opening a new file.)

Append-update

+>>

Open the file for read and write access.
Ex: open (FILHND, "+>>$file");
If the file doesn't exist then it is created and opened. If the file does exist then its contents are preserved and every write is appended at the end of the file; use seek to position the handle before reading.

File Handling (Contd..)


Reading Lines
Example:
exam.txt ::
    winners don't do different things.
    winners do things differently.
    success is not a destination.
Perl Script:
    #!/usr/local/bin/perl
    open FILE, "exam.txt" or die $!;
    $lineno = 0;
    while (<FILE>) {
        print $lineno++, " \t";
        print "$_";
    }
This program displays:
    0 winners don't do different things.
    1 winners do things differently.
    2 success is not a destination.

File Handling (Contd..)


< > is called the readline or diamond operator. In the above example, while (<FILE>) is equivalent to while (defined($_ = <FILE>)). The statement reads a line from the file, assigns it to $_ and checks whether it is defined or not. If it is not defined, the end of the file has probably been reached, so the loop ends.
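The same loop can be written with an explicit variable instead of $_. The sketch below is self-contained: it first writes the sample lines to a temporary file (using the core File::Temp module, an assumption made so the example does not depend on exam.txt existing), then reads them back. chomp removes the trailing newline, and the built-in variable $. holds the current line number, counting from 1.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file holding the sample lines.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print $tmp_fh "winners don't do different things.\n",
              "winners do things differently.\n",
              "success is not a destination.\n";
close($tmp_fh);

open(my $fh, '<', $tmp_name) or die "Can't open $tmp_name: $!";
my @numbered;
while (defined(my $line = <$fh>)) {     # the explicit form of while (<$fh>)
    chomp $line;                        # strip the trailing newline
    push @numbered, "$.: $line";        # $. is the current line number
}
close($fh);

print "$_\n" for @numbered;
```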

File Handling (Contd..)


Perl provides a special handle called ARGV. It opens and reads, in turn, all the files named on the command line. It reads from standard input (STDIN) if nothing is specified on the command line. If you don't specify anything in the angle brackets (<>), ARGV is used, so whatever files are listed in @ARGV are read.

File Handling Examples(1)


Command line passing
Example:
exam.txt ::
    winners don't do different things.
    success is not a destination.
Perl Script:
    #!/usr/local/bin/perl
    $match = "do";
    while (<>) {
        if (/$match/) {
            print "FOUND\n";
        }
        else {
            print "NOT FOUND\n";
        }
    }
Run as: perl filename.pl exam.txt
This program displays:
    FOUND
    NOT FOUND

File Handling Examples(2)


Command line passing
Example:
    #!/usr/local/bin/perl
    print "Filename: ";
    my $infile = <STDIN>;     # <> would read from a file named on the command line instead
    chomp $infile;
    print "New name: ";
    my $outfile = <STDIN>;
    chomp $outfile;
    open(IN, "<", $infile) or die "Can't open $infile: $!\n";
    open(OUT, ">", $outfile) or die "Can't open $outfile: $!\n";
    print OUT <IN>;
    close IN;                 # syntax :: close <filehandle>
    close OUT;
Sample session:
    perl filename.pl
    Filename: exam.txt
    New name: copy_exam.txt   # it will create this file.

File Handling Examples(3)


Command line passing
Example:
    if (open (LOGFILE, ">>message.log")) {
        print LOGFILE ("This is message number 3.\n");
        print LOGFILE ("This is message number 4.\n");
        close (LOGFILE);    # close function
    }
If message.log already held messages 1 and 2, it now contains:
    This is message number 1.
    This is message number 2.
    This is message number 3.
    This is message number 4.

print, printf, and write Functions


The print function writes to the file specified, or to the current default file if no file is specified.
Ex: print ("Hello, there!\n");
    print OUTFILE ("Hello, there!\n");
The printf function writes formatted output built from a format string, as in C.
Ex: printf ("%5.2f\n", $value);
The write function uses a print format to send formatted output to the file that is specified, or to the current default file.
Ex: write (CD_REPORT);

Directories Handling

Directories Handling (Contd..)


To create a new directory, call the function mkdir.

Syntax: mkdir (dirname, permissions);

Ex: mkdir ("/u/public/newdir", 0777);

To set a directory to be the current working directory, use the function chdir.

Syntax: chdir (dirname);

Ex: chdir ("/u/public/newdir");

Directories Handling (Contd..)


To open an already existing directory, use the function opendir.

Syntax: opendir (dirvar, dirname);

Ex: opendir (DIR, "/u/kacper/mydir");

To close an opened directory, use the closedir function.

Syntax: closedir (dirvar);

Ex: closedir (DIR);
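The directory functions can be combined into one short round trip. This is a sketch under stated assumptions: it works inside a temporary directory from the core File::Temp module, the directory and file names are made up, and it uses the built-in readdir function (not covered in the slides) to list the entries of the opened directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $base   = tempdir(CLEANUP => 1);      # scratch area, removed at exit
my $newdir = "$base/newdir";             # hypothetical directory name

mkdir($newdir, 0755) or die "Can't create $newdir: $!";

# Create two empty files so the directory has something to list.
for my $name (qw(b.txt a.txt)) {
    open(my $fh, '>', "$newdir/$name") or die "Can't create $name: $!";
    close($fh);
}

opendir(my $dh, $newdir) or die "Can't open directory $newdir: $!";
# readdir also returns the special entries . and .., which are skipped here.
my @entries = sort grep { $_ ne '.' && $_ ne '..' } readdir($dh);
closedir($dh);

print "@entries\n";
```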

ThanQ
