Context Free Grammar
The earliest computers accepted no instructions other than those of their own assembly language. Every procedure, no matter how complicated, had to be encoded in a small set of instructions such as LOAD, STORE, ADD the contents of two registers, and so on. The major problem was how to express mathematical formulas laid out in two dimensions, such as the following
$$S = \frac{(8-0)^2 + (7-10)^2 + (11-10)^2}{2}$$
or
$$A = \frac{\frac{1}{2} + 9}{4 + \frac{8}{21} + \frac{5}{3 + \frac{1}{2}}}$$
So it was necessary to develop a way of writing such expressions on one line of standard typewriter symbols, so that a high level language could be invented. Before the invention of computers, no one would ever have dreamed of writing such a complicated formula with parentheses; e.g. the right side of the formula for A can be written as ((1/2)+9)/(4+(8/21)+(5/(3+(1/2))))
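As a small illustration of the one-line notation, the parenthesized expression can be typed almost verbatim into a high level language. The sketch below uses Python with exact fractions; the variable name A is taken from the formula, everything else is an assumption made for the illustration.

```python
from fractions import Fraction

# The one-line form ((1/2)+9)/(4+(8/21)+(5/(3+(1/2)))),
# written with exact fractions so that no rounding occurs.
A = (Fraction(1, 2) + 9) / (4 + Fraction(8, 21) + (5 / (3 + Fraction(1, 2))))

print(A)  # 399/244
```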
The high level language is converted into assembly language code by a program called a compiler. The compiler takes the user's program as its input and prints out an equivalent program written in assembly language. Like spoken languages, high level languages for computers also have a certain grammar. But in the case of computers, the grammatical rules do not involve the meaning of the words.
It can be noted that grammatical rules which involve the meaning of words are called semantic, while those which do not involve the meaning of words are called syntactic. For example, in English one cannot meaningfully write "Buildings sing", while in a computer language one number is as good as another, e.g. X = B + 10 and X = B + 999 are equally acceptable.
Remark
In general, the rules of computer language grammar are all syntactic and not semantic. A law of grammar is in reality a suggestion for possible substitutions.
CFG terminologies
Terminals: The symbols that cannot be replaced by anything are called terminals.
Non-Terminals: The symbols that must be replaced by other things are called non-terminals.
Productions: The grammatical rules are often called productions.
CFG
A CFG is a collection of the following
1. An alphabet Σ of letters called terminals, from which the strings are formed that will be the words of the language.
2. A set of symbols called non-terminals, one of which is S, standing for "start here".
3. A finite set of productions of the form
non-terminal → finite string of terminals and/or non-terminals.
Note
The terminals are designated by small letters, while the non-terminals are designated by capital letters. There is at least one production that has the non-terminal S as its left side.
Example
Σ = {a} productions: 1. S → aS 2. S → Λ
It can be observed that prod. (2) alone generates Λ, that a can be generated by applying prod. (1) once and then prod. (2), that aa can be generated by applying prod. (1) twice and then prod. (2), and so on. This shows that the grammar defines the language expressed by a*.
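The derivation just described can be mimicked step by step. The following is a minimal sketch, assuming the two productions S → aS and S → Λ; the function name derive_a_star is hypothetical and Λ is modelled as the empty Python string.

```python
def derive_a_star(n):
    """Apply production 1 (S -> aS) n times, then production 2 (S -> Λ).

    Returns the list of working strings and the generated word.
    """
    working = "S"
    steps = [working]
    for _ in range(n):
        working = working.replace("S", "aS", 1)   # production 1
        steps.append(working)
    word = working.replace("S", "", 1)            # production 2: S is deleted
    steps.append(word if word else "Λ")
    return steps, word


steps, word = derive_a_star(3)
print(" => ".join(steps))   # S => aS => aaS => aaaS => aaa
```

Calling derive_a_star(0) uses production 2 only and yields the null word.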
Example
Σ = {a} productions: 1. S → SS 2. S → a 3. S → Λ
This grammar also defines the language expressed by a*.
Note: Λ is not considered to be a terminal. It has a special status. For a certain non-terminal N there may be a production N → Λ. This simply means that N can be deleted wherever it occurs in the working string.
Example
Σ = {a,b} productions: 1. S → X 2. S → Y 3. X → Λ 4. Y → aY 5. Y → bY 6. Y → a 7. Y → b
All words of this language are either of X-type or of Y-type, i.e. while generating a word the first production used is S → X or S → Y. The only word of X-type is Λ, while the words of Y-type are the non-empty finite strings of a's or b's or both, i.e. (a+b)⁺. Thus the language defined is expressed by (a+b)*.
Example
Σ = {a,b} productions: 1. S → aS 2. S → bS 3. S → a 4. S → b 5. S → Λ
This grammar also defines the language expressed by (a+b)*.
Example
Σ = {a,b} productions: 1. S → XaaX 2. X → aX 3. X → bX 4. X → Λ
This grammar defines the language expressed by (a+b)*aa(a+b)*.
Example
Consider the following CFG
Σ = {a,b} productions: 1. S → YXY 2. Y → aY|bY|Λ 3. X → bbb
It can be observed that, using prod. 2, Y generates Λ, Y generates a, Y generates b, and Y also generates all combinations of a and b. Thus Y generates the strings expressed by (a+b)*. It may also be observed that the above CFG generates the language expressed by (a+b)*bbb(a+b)*. Following are some words generated by the given CFG
S ⇒ YXY ⇒ aYbbbΛ ⇒ abYbbbΛ ⇒ abΛbbbΛ = abbbb
S ⇒ YXY ⇒ ΛbbbaY ⇒ ΛbbbabY ⇒ ΛbbbabaY ⇒ ΛbbbabaΛ = bbbaba
S ⇒ YXY ⇒ bYbbbaY ⇒ bΛbbbabY ⇒ bΛbbbabbY ⇒ bΛbbbabbaY ⇒ bΛbbbabbaΛ = bbbbabba
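The claim that this CFG generates (a+b)*bbb(a+b)* can be checked mechanically for short words. The sketch below is only a bounded experiment, not a proof: it enumerates every word of length at most 8 that the grammar derives (always rewriting the leftmost non-terminal) and compares the result with a regular expression. The helper name generate and the bound MAX_LEN are assumptions made for the illustration.

```python
import re
from itertools import product

# Capital letters are non-terminals; "" stands for the null string Λ.
GRAMMAR = {"S": ["YXY"], "Y": ["aY", "bY", ""], "X": ["bbb"]}


def generate(grammar, start, max_len):
    """Return all words with at most max_len letters derivable from start."""
    found, seen, stack = set(), {start}, [start]
    while stack:
        working = stack.pop()
        # Index of the leftmost non-terminal, or None for a finished word.
        pos = next((i for i, ch in enumerate(working) if ch.isupper()), None)
        if pos is None:
            found.add(working)
            continue
        for rhs in grammar[working[pos]]:
            new = working[:pos] + rhs + working[pos + 1:]
            # Terminals are never deleted, so longer strings can be dropped.
            if sum(ch.islower() for ch in new) <= max_len and new not in seen:
                seen.add(new)
                stack.append(new)
    return found


MAX_LEN = 8
PATTERN = re.compile(r"(a|b)*bbb(a|b)*")
all_strings = {"".join(p) for n in range(MAX_LEN + 1)
               for p in product("ab", repeat=n)}
expected = {w for w in all_strings if PATTERN.fullmatch(w)}
generated = generate(GRAMMAR, "S", MAX_LEN)

print(generated == expected)  # True
```

The pruning rule relies on the fact that no production of this grammar removes a terminal; a grammar in which the number of non-terminals can grow without adding terminals would need an additional bound.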
Example
Consider the following CFG
Σ = {a,b} productions:
1. S → SS|XaXaX|Λ
2. X → bX|Λ
It can be observed that, using prod. 2, X generates Λ and X generates any number of b's. Thus X generates the strings expressed by b*. It may also be observed that the above CFG generates the language expressed by (b*ab*ab*)*.
Example
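Consider the following CFG
Σ = {a,b} productions: S → aSa|bSb|a|b|Λ
The above CFG generates the language PALINDROME. It may be noted that simply dropping Λ, i.e. S → aSa|bSb|a|b, yields only the palindromes of odd length; the language NONNULLPALINDROME is generated by the CFG S → aSa|bSb|a|b|aa|bb.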
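Membership in PALINDROME can be tested by running the productions S → aSa|bSb|a|b|Λ backwards: strip a matching first and last letter, and stop at a word of length 0 or 1. The following is a sketch of that idea; the function name in_palindrome is hypothetical.

```python
def in_palindrome(word):
    """Decide membership in PALINDROME over {a, b} by undoing
    the productions S -> aSa | bSb | a | b | Λ."""
    if any(ch not in "ab" for ch in word):
        return False
    if len(word) <= 1:              # S -> a, S -> b or S -> Λ
        return True
    if word[0] == word[-1]:         # S -> aSa or S -> bSb
        return in_palindrome(word[1:-1])
    return False


print(in_palindrome("abba"), in_palindrome("abab"))  # True False
```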
Example
Consider the following CFG
Σ = {a,b} productions: S → aSb|ab|Λ
It can be observed that the CFG generates the language {a^n b^n : n = 0, 1, 2, 3, ...}. It may also be noted that the language {a^n b^n : n = 1, 2, 3, ...} can be generated by the following CFG
S → aSb|ab
Task
Construct a CFG that generates the language L = {w ∈ {a,b}* : length(w) ≥ 2 and the second letter of w from the right is a}
Example
Consider the following CFG
(1) S → aXb|bXa
(2) X → aX|bX|Λ
The above CFG generates the language of strings, defined over Σ = {a,b}, beginning and ending in different letters.
Task
Construct a CFG for the language of strings, defined over Σ = {a,b}, beginning and ending in the same letter.
Trees
Just as any sentence of the English language can be expressed by a parse tree, any word generated by a given CFG can also be expressed by a parse tree. For example, consider the following CFG
S → AA
A → AAA|bA|Ab|a
Obviously, baab can be generated by the above CFG. To express the word baab as a parse tree, start with S. Replace S by the string AA of non-terminals, drawing downward lines from S to each character of this string as follows
      S
    /   \
   A     A
Now let the left A be replaced by bA and the right one by Ab then the tree will be
      S
    /   \
   A     A
  / \   / \
 b   A A   b
Replacing both remaining A's by a, the above tree will be
      S
    /   \
   A     A
  / \   / \
 b   A A   b
     | |
     a a
Thus the word baab is generated. The above tree generating the word baab is called a syntax tree, generation tree or derivation tree as well.
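A parse tree is easy to represent as a nested data structure. The sketch below encodes the tree for baab as (symbol, children) pairs, reads the generated word off the leaves, and checks that every internal node uses one of the productions S → AA, A → AAA|bA|Ab|a. The names TREE, tree_yield and uses_only_productions are hypothetical.

```python
GRAMMAR = {"S": ["AA"], "A": ["AAA", "bA", "Ab", "a"]}

# A node is (symbol, children); a terminal leaf has an empty child list.
TREE = ("S", [
    ("A", [("b", []), ("A", [("a", [])])]),   # left A  -> bA, then A -> a
    ("A", [("A", [("a", [])]), ("b", [])]),   # right A -> Ab, then A -> a
])


def tree_yield(node):
    """Concatenate the leaves from left to right."""
    symbol, children = node
    if not children:
        return symbol
    return "".join(tree_yield(child) for child in children)


def uses_only_productions(node, grammar):
    """Check that each internal node expands by a production of the grammar."""
    symbol, children = node
    if not children:
        return symbol.islower()               # leaves must be terminals
    rhs = "".join(child[0] for child in children)
    return rhs in grammar.get(symbol, []) and all(
        uses_only_productions(child, grammar) for child in children)


print(tree_yield(TREE))  # baab
```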
Turing machine
The mathematical models (FAs, TGs, PDAs) that have been discussed so far can decide whether or not a string is accepted by them, i.e. these models are language identifiers. However, there are still some languages which cannot be accepted by them, e.g. no FA, TG or PDA accepts a non-CFL. Alan Mathison Turing developed the machines called Turing machines, which accept some non-CFLs as well, in addition to the CFLs.
Definition: A Turing machine (TM) consists of the following
1. An alphabet Σ of input letters.
2. An input TAPE partitioned into cells, having infinitely many locations in one direction. The input string is placed on the TAPE with its first letter in cell i; the rest of the TAPE is initially filled with blanks (Δ's).
[Figure: the input TAPE with cells numbered i, ii, iii, iv, ... and the TAPE Head]
3. A TAPE Head that can read the contents of one cell of the TAPE in one step. It can replace the character in that cell and can then reposition itself to the next cell to the right or to the left of the one it has just read.
Note
It may be noted that a state need not have an outgoing edge for every letter that may be read from the TAPE. It may also be noted that no state can have more than one outgoing edge for the same letter to be read from the TAPE, so the machine is deterministic. The machine crashes if there is no path for the letter read from the TAPE, and the corresponding string is then rejected.
To terminate the execution of an input string successfully, a HALT state must be entered; the corresponding string is then accepted by the TM. The machine also crashes when the TAPE Head is instructed to move one cell to the left of cell i.
Example
Consider the following Turing machine
[Figure: TM diagram with the following edges]
1 START → 2 on (a,a,R) or (b,b,R)
2 → 3 on (b,b,R)
3 → 3 on (a,a,R) or (b,b,R)
3 → 4 HALT on (Δ,Δ,R)
Input TAPE
cell:     i   ii   iii   iv   ...
content:  a   b    a     Δ    ...
The TAPE Head is on cell i.
Starting from the START state and reading a from the TAPE, according to the TM program a is printed, i.e. a is replaced by a, and the TAPE Head is moved one cell to the right.
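1: [a]ba  →  2: a[b]a  →  3: ab[a]  →  3: aba[Δ]  →  HALT
(the number is the current state and the brackets mark the cell under the TAPE Head)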
This shows that the string aba is accepted by this machine. It can be observed from the program of the TM that the machine accepts the language expressed by (a+b)b(a+b)*.
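The behaviour of such a machine can be simulated in a few lines. The sketch below is an illustration under stated assumptions: the transition table PROGRAM is one reading of the flattened diagram that is consistent with the language (a+b)b(a+b)*, and the names run_tm, BLANK and HALT are hypothetical. A missing edge and a move to the left of cell i are both treated as a crash.

```python
BLANK = "Δ"
HALT = "HALT"

# (state, letter read) -> (next state, letter written, head move).
# Assumed reading of the diagram; state 4 of the figure is HALT.
PROGRAM = {
    (1, "a"): (2, "a", "R"),
    (1, "b"): (2, "b", "R"),
    (2, "b"): (3, "b", "R"),
    (3, "a"): (3, "a", "R"),
    (3, "b"): (3, "b", "R"),
    (3, BLANK): (HALT, BLANK, "R"),
}


def run_tm(program, word, start=1, max_steps=10_000):
    """Return True if the TM halts on word, False if it crashes."""
    tape = list(word)
    state, head, steps = start, 0, 0
    while state != HALT:
        if steps == max_steps:
            raise RuntimeError("step limit reached; the machine may loop")
        if head == len(tape):
            tape.append(BLANK)          # the rest of the TAPE is blank
        key = (state, tape[head])
        if key not in program:
            return False                # crash: no edge for this letter
        state, tape[head], move = program[key]
        head += 1 if move == "R" else -1
        if head < 0:
            return False                # crash: moved left of cell i
        steps += 1
    return True


print(run_tm(PROGRAM, "aba"), run_tm(PROGRAM, "aab"))  # True False
```

The step limit is a design choice of the sketch: a TM may run forever on some inputs, so a simulator needs some way to give up.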
Theorem
Every regular language is accepted by some TM.
Example
Consider the EVEN-EVEN language. Following is a TM accepting the EVEN-EVEN language.
[Figure: TM diagram with states 1 START, 2, 3, 4 and 5 HALT; the edges among states 1 to 4 are labelled (a,a,R) and (b,b,R), and an edge labelled (Δ,Δ,R) leads from 1 START to 5 HALT]
It may be noted that the above diagram is similar to that of the FA corresponding to the EVEN-EVEN language.
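The idea behind the theorem can be sketched as a conversion: every FA edge labelled x becomes a TM edge (x,x,R), and every final state gets an extra edge (Δ,Δ,R) into HALT. The code below applies this to an FA for EVEN-EVEN. The state names "ee", "oe", "eo", "oo" (parity of a's, parity of b's) and the functions dfa_to_tm and accepts are assumptions made for the illustration; they do not reproduce the state numbering of the figure.

```python
BLANK = "Δ"
HALT = "HALT"


def dfa_to_tm(delta, finals):
    """Turn FA transitions into a TM program that only moves right."""
    program = {(q, x): (p, x, "R") for (q, x), p in delta.items()}
    for q in finals:
        program[(q, BLANK)] = (HALT, BLANK, "R")
    return program


# FA for EVEN-EVEN: reading a flips the first parity, b the second.
EVEN_EVEN_DELTA = {
    ("ee", "a"): "oe", ("oe", "a"): "ee",
    ("eo", "a"): "oo", ("oo", "a"): "eo",
    ("ee", "b"): "eo", ("eo", "b"): "ee",
    ("oe", "b"): "oo", ("oo", "b"): "oe",
}


def accepts(program, word, start):
    """Run a right-moving TM program; a missing edge is a crash."""
    tape = list(word) + [BLANK]
    state, head = start, 0
    while state != HALT:
        key = (state, tape[head])
        if key not in program:
            return False
        state, tape[head], _move = program[key]
        head += 1                       # every edge of this TM moves right
    return True


TM = dfa_to_tm(EVEN_EVEN_DELTA, finals={"ee"})
print(accepts(TM, "abba", "ee"), accepts(TM, "aab", "ee"))  # True False
```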
Example
Consider the following TM
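[Figure: TM diagram; the states recoverable from the figure include 2, 3, 4 and 8, and the recoverable edge labels include (a,a,R), (b,b,R), (a,a,L), (b,b,L), (*,*,R) and (a,Δ,L)]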
The string aaabbbaaa can be observed to be accepted by the above TM. It can also be observed that the above TM accepts the non-CFL {a^n b^n a^n}.
INSERT subprogram
Sometimes a character is required to be inserted on the TAPE exactly at the spot where the TAPE Head is pointing, so that the character occupies the required cell and the other characters on the TAPE are moved one cell to the right. The characters to the left of the pointed cell are required to remain unchanged.
In the situation stated above, the part of the TM program that executes the process of insertion does not affect the function that the TM is performing. The subprogram of insertion is independent and can be incorporated at any time into any TM program by specifying which character is to be inserted at which location. The subprogram of insertion can be expressed as
INSERT a
INSERT b
INSERT #
The above diagrams show that the characters a, b and # are to be inserted, respectively. Following is an example showing how the subprogram INSERT performs its function
Example
If the letter b is to be inserted at the cell where the TAPE Head is pointing (here the cell containing a), as shown below
... b X a b b X ...
then, it is expressed as
... b X a b b X ...
INSERT b
The function of subprogram INSERT b can be observed from the following diagram
... b X b a b b X ...
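[Figure: diagram of the subprogram INSERT b with states 1 to 7, state 7 being the Out state; the edge labels recoverable from the figure include (a,a,R), (Δ,b,R) and (X,X,R)]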
It is supposed that the machine is at state 1 when b is to be inserted. All three possibilities of reading a, b or X are considered by introducing the states 2, 3 and 4 respectively. These states remember which letter has just been displaced, starting with the letter replaced by the marker Q. Consider the same location where b is to be inserted
... b X a b b X ...
After reading a from the TAPE, the program replaces a by Q and the TAPE Head is moved one step right. Here state 2 is entered. Reading b at state 2, b is replaced by a and state 3 is entered. At state 3, b is read; it is replaced by b itself, so the cell is unchanged and the machine stays in state 3.
At state 3, the next letter to be read is X, which is replaced by b, and state 4 is entered. At state 4, Δ is read, which is replaced by X, and state 5 is entered. At state 5, Δ is read and, without any change, state 6 is entered, while the TAPE Head is moved one step left. State 6 makes no change to whatever letter (other than Q) is read at that state; at each step, however, the TAPE Head is moved one step left. Finally, Q is read, which is replaced by b, and the TAPE Head is moved one step right.
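The walk-through above can be turned into a small simulation. The sketch below builds the transition table from the description (states 2, 3 and 4 remember a, b and X; state 5 reads the trailing blank; state 6 walks back to Q; state 7 is Out). The tape alphabet {a, b, X}, the function names insert_program and run_insert, and the handling of the tape as a Python list are assumptions made for the illustration; inserting while the TAPE Head is on a blank is not covered.

```python
BLANK, MARK = "Δ", "Q"
LETTERS = ("a", "b", "X")
HOLD = {"a": 2, "b": 3, "X": 4}    # state that remembers a displaced letter


def insert_program(new_letter):
    """Transition table (state, read) -> (next state, write, move)."""
    prog = {}
    for x in LETTERS:
        prog[(1, x)] = (HOLD[x], MARK, "R")          # mark the spot with Q
        for y in LETTERS:
            prog[(HOLD[x], y)] = (HOLD[y], x, "R")   # drop x, pick up y
        prog[(HOLD[x], BLANK)] = (5, x, "R")         # last letter written
        prog[(6, x)] = (6, x, "L")                   # walk back unchanged
    prog[(5, BLANK)] = (6, BLANK, "L")
    prog[(6, MARK)] = (7, new_letter, "R")           # state 7 is Out
    return prog


def run_insert(tape, head, new_letter):
    """Insert new_letter at index head; return the new tape and head."""
    tape = list(tape)
    prog = insert_program(new_letter)
    state = 1
    while state != 7:
        if head == len(tape):
            tape.append(BLANK)
        state, tape[head], move = prog[(state, tape[head])]
        head += 1 if move == "R" else -1
    return "".join(tape).rstrip(BLANK), head


print(run_insert("bXabbX", 2, "b"))  # ('bXbabbX', 3)
```

The returned head position is the cell just to the right of the inserted letter, matching the final move described above.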
DELETE subprogram
Sometimes a character is required to be deleted from the TAPE exactly at the spot where the TAPE Head is pointing, so that the other characters to the right of the TAPE Head are moved one cell to the left. The characters to the left of the pointed cell are required to remain unchanged.
In the situation stated above, the part of the TM program that executes the process of deletion does not affect the function that the TM is performing. The subprogram of deletion is independent and can be incorporated at any time into any TM program by specifying which character is to be deleted at which location. The subprogram of deletion is expressed as a box labelled DELETE.
Example
If the letter a is to be deleted from the string bcabbc, shown below
... b c a b b c ...
then, it is expressed as
... b c a b b c ...
DELETE
The function of subprogram DELETE can be observed from the following diagram
... b c b b c ...
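[Figure: diagram of the subprogram DELETE, entered at In and left at Out (state 7); the edge labels recoverable from the figure include (b,Δ,L), (c,Δ,L), (a,a,L), (b,b,L), (c,c,L), (b,a,L), (a,b,L), (a,c,L), (c,b,L), (b,c,L), (c,a,L), (Δ,a,R), (Δ,b,R) and (Δ,c,R)]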
The process of deleting the letter a from the string bcabbc can easily be checked, giving the TAPE situation shown below
... b c b b c ...
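One possible way to realise DELETE is sketched below. It is an assumption, not a reconstruction of the diagram: state 1 blanks the cell under the TAPE Head, state 2 runs to the right end, state 3 picks up the last letter, and states 4, 5 and 6 carry a letter one cell to the left until the blanked cell is filled. The state numbering, the tape alphabet {a, b, c} and the names delete_program and run_delete are all hypothetical, and the data on the TAPE is assumed to contain no blanks.

```python
BLANK = "Δ"
LETTERS = ("a", "b", "c")
HOLD = {"a": 4, "b": 5, "c": 6}    # state that carries a letter leftwards
OUT = 7


def delete_program():
    """Transition table (state, read) -> (next state, write, move)."""
    prog = {
        (2, BLANK): (3, BLANK, "L"),     # right end reached, turn around
        (3, BLANK): (OUT, BLANK, "R"),   # the deleted letter was the last
    }
    for x in LETTERS:
        prog[(1, x)] = (2, BLANK, "R")           # blank the pointed cell
        prog[(2, x)] = (2, x, "R")               # run to the right end
        prog[(3, x)] = (HOLD[x], BLANK, "L")     # pick up the last letter
        prog[(HOLD[x], BLANK)] = (OUT, x, "R")   # fill the blanked cell
        for y in LETTERS:
            prog[(HOLD[x], y)] = (HOLD[y], x, "L")   # shift one cell left
    return prog


def run_delete(tape, head):
    """Delete the letter at index head and return the resulting string."""
    tape = list(tape)
    prog = delete_program()
    state = 1
    while state != OUT:
        if head == len(tape):
            tape.append(BLANK)
        state, tape[head], move = prog[(state, tape[head])]
        head += 1 if move == "R" else -1
    return "".join(tape).rstrip(BLANK)


print(run_delete("bcabbc", 2))  # bcbbc
```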