C Tutorials
https://embetronicx.com/c-tutorials
In this post, I’ll walk through each of the four stages of compilation using the following C
program:
Preprocessing
The first stage of compilation is called preprocessing. In this stage, lines starting with
a # character are interpreted by the preprocessor as preprocessor commands. Before
interpreting the commands, the preprocessor does some initial processing. This includes
joining continued lines (lines ending with a \) and stripping comments.
Given the “Hello, World!” example above, the preprocessor will produce the contents of
the stdio.h header file joined with the contents of the hello_world.c file, stripped free from
its leading comment:
Compilation
The second stage of compilation is confusingly enough called compilation. In this stage, the
preprocessed code is translated to assembly instructions specific to the target processor
architecture. These form an intermediate human-readable language.
The existence of this step allows for C code to contain inline assembly instructions and for
different assemblers to be used.
Some compilers also support the use of an integrated assembler, in which the compilation stage
generates machine code directly, avoiding the overhead of generating the intermediate assembly
instructions and invoking the assembler.
To stop after this stage and keep the generated assembly, pass the -S option to cc:
cc -S hello_world.c
This will create a file named hello_world.s, containing the generated assembly instructions. On Mac
OS 10.10.4, where cc is an alias for clang, the following output is generated:
Assembly
During the assembly stage, an assembler is used to translate the assembly instructions to machine
code, or object code. The output consists of actual instructions to be run by the target processor.
To stop after the assembly stage and keep the object file, pass the -c option to cc:
cc -c hello_world.c
Running the above command will create a file named hello_world.o, containing the object code of
the program. The contents of this file are in a binary format and can be inspected
using hexdump or od by running either one of the following commands:
Linking
The object code generated in the assembly stage is composed of machine instructions that the
processor understands but some pieces of the program are out of order or missing. To produce an
executable program, the existing pieces have to be rearranged and the missing ones filled in. This
process is called linking.
Let’s assume your project contains two source files (hello.c and world.c). When you compile the
project, the assembler produces two object files, hello.o and world.o. In the end, we need only one
binary file to load onto the target processor or controller, so the linker combines hello.o and
world.o into a single binary.
That is why you get an error from the linker when you call a function that is not defined anywhere:
the linker looks for that function in the other object files and libraries, and reports an error if
it cannot find it.
The linker will arrange the pieces of object code so that functions in some pieces can successfully call
functions in other pieces. It will also add pieces containing the instructions for library functions used
by the program. In the case of the “Hello, World!” program, the linker will add the object code for
the puts function.
The result of this stage is the final executable program. When run without options, cc will name this
file a.out. To name the file something else, pass the -o option to cc:
cc -o hello_world hello_world.c
When you run any C program, its executable image is loaded into the RAM of the computer in an
organized manner which is called the process address space or Memory layout of the C program.
This memory layout is organized in the following fashion:
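1. Text or Code Segment
2. Initialized Data Segment
3. Uninitialized Data Segment (BSS)
4. Stack Segment
5. Heap Segment
6. Unmapped or Reserved Segment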
The code segment, also known as the text segment, contains the machine code of the compiled program.
The text segment of an executable object file is often read-only, which prevents the program from
being accidentally modified. This is the content that comes from the executable image itself (for
example, a .bin, .exe, or .hex file).
As a memory region, the text segment may be placed below the heap and stack in order to prevent
heap and stack overflows from overwriting it.
Data Segments
The data segment stores program data. This data can be in the form of initialized or uninitialized
variables, and it can be local or global. The data segment is further divided into four sub-data
segments (initialized data segment, uninitialized or .bss data segment, stack, and heap) to store
variables depending on whether they are local or global, and initialized or uninitialized.
The initialized data segment, or simply the data segment, stores all global, static, constant, and
external variables (declared with the extern keyword) that are initialized beforehand.
Note that the data segment as a whole is not read-only, since the values of the variables can be
changed at run time.
This segment can be further classified into an initialized read-only area and an initialized
read-write area.
All initialized global, static, and external variables are stored in the read-write area, except
const variables, which are stored in the read-only area.
int main()
{
static int i = 10; /* illustrative: initialized static, stored in the read-write data area */
}
The uninitialized data segment is also called the BSS segment. BSS stands for ‘Block Started by
Symbol’, named after an ancient assembler operator. The uninitialized data segment contains all
global and static variables that are initialized to zero or do not have explicit initialization in
the source code.
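int global_uninit; /* illustrative: uninitialized global, stored in bss */
int main()
{
static int i; /* illustrative: uninitialized static, stored in bss */
return 0;
}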
Stack Segment
Stack is the place where automatic variables are stored, along with information that is saved each
time a function is called. Each time a function is called, the address of where to return to and certain
information about the caller’s environment, such as some of the machine registers, are saved on
the stack. The newly called function then allocates room on the stack for its automatic and
temporary variables. This is how recursive functions in C can work. Each time a recursive function
calls itself, a new stack frame is used, so one set of variables doesn’t interfere with the variables
from another instance of the function.
So, the Stack frame contains some data like return address, arguments passed to it, local variables,
and any other information needed by the invoked function.
A “stack pointer (SP)” register keeps track of the top of the stack: each push or pop operation
adjusts the stack pointer to the next or previous address.
The stack area traditionally adjoined the heap area and grew in the opposite direction; when the
stack pointer met the heap pointer, free memory was exhausted. (With modern large address spaces
and virtual memory techniques they may be placed almost anywhere, but they still typically grow in
opposite directions.)
If you want to learn the stack implementation, you can refer to this article.
Heap Segment
Heap is the segment where dynamic memory allocation usually takes place.
The heap area begins at the end of the BSS segment and grows to larger addresses from there.
The Heap area is managed by malloc, realloc, and free, which may use the brk and sbrk system calls
to adjust its size (note that the use of brk/sbrk and a single “heap area” is not required to fulfill the
contract of malloc/realloc/free; they may also be implemented using mmap to reserve potentially
non-contiguous regions of virtual memory into the process’ virtual address space). The Heap area is
shared by all shared libraries and dynamically loaded modules in a process.
Unmapped or reserved segments contain command-line arguments and other program-related data, such
as the lower and higher addresses of the executable image.
Let us look at the memory layout using a practical example. After each step, compile the program
and check the segment sizes, for example with the size utility.
Example
Step 1:
int main(void)
{
return 0;
}
Step 2:
Let us add one global variable in the program, now check the size of bss.
#include <stdio.h>
int global; //Uninitialized global variable stored in bss
int main(void)
{
return 0;
}
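Step 3:
Let us add one static variable, which is also stored in bss.
#include <stdio.h>
int global; //Uninitialized global variable stored in bss
int main(void)
{
static int i; //Uninitialized static variable stored in bss
return 0;
}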
Compile and Check.
Step 4:
Let us initialize the static variable to non-zero which will then be stored in Initialized Data Segment
(DS).
#include <stdio.h>
int global; //Uninitialized global variable stored in bss
int main(void)
{
static int i = 10; //Initialized static variable stored in Initialized Data Segment
return 0;
}
Step 5:
Let us initialize the global variable to non-zero which will then be stored in Initialized Data Segment
(DS).
#include <stdio.h>
int global = 10; //Initialized global variable stored in Initialized Data Segment
int main(void)
{
static int i = 10; //Initialized static variable stored in Initialized Data Segment
return 0;
}
Compile and Check
Enum (Enumeration) in C
Introduction
Enumerated Types are a special way of creating your own Type in C. The type is a “list of keywords”.
Enumerated types are used to make a program clearer to the reader/maintainer of the program. For
example, say we want to write a program that checks for keyboard presses to find if the down arrow
or up arrow has been pressed. We could say: if( press_value == 32 ). 32 is the computer’s
representation of the down arrow. Or, we could create our own enumerated type with the
keywords: down_arrow and up_arrow. Then we could say: if( press_value == down_arrow ). This
second version is much more readable and understandable to the programmer.
Enumerated Types
Enumerated Types allow us to create our own symbolic names for a list of related ideas. The
keyword for an enumerated type is enum. For example, we could create an enumerated type for
true and false (note: since C99 this is already provided by the standard header <stdbool.h> as
type bool).
enum Boolean
{
false,
true
};
We could also create an enumerated type to represent various security levels so that our program
could run a door card reader:
enum Security_Levels
{
black_ops,
top_secret,
secret,
non_secret
}; // don't forget the semi-colon ;
These enumerated types can be used like any other type in a program. The type name “Boolean” or
“Security_Levels” can be used to define variables or the return type of functions. The actual
enumerations can be used directly in the code. For example:
In essence, enumerated types provide a symbolic name to represent one state out of a list of states.
It bears repeating: even though enumerated type values look like strings, they are new identifiers
that we define for our program. Once defined, the compiler can process them directly. There is no
need to use strcmp with enumerated types.
// CORRECT:
if ( my_security_level == secret ) ...
// INCORRECT:
if ( ! strcmp(my_security_level,secret) ) ...
// ALSO INCORRECT:
if ( ! strcmp(my_security_level,"secret") ) ...
It turns out that enumerated types are treated like integers by the compiler. Underneath they have
numbers 0,1,2,… etc. You should never rely on this fact, but it does come in handy for certain
applications.
For example, our Security_Levels enum has the following values:
black_ops = 0
top_secret = 1
secret = 2
non_secret = 3
Note: One of the shortcomings of Enumerated Types is that they don’t print nicely. To print the
“String” associated with the defined enumerated value, you must use the following cumbersome
code:
if ( my_security_level == secret ) {
printf("secret");
} else if ( my_security_level == top_secret ) {
printf("top_secret");
}
...
The following “trick” to print enums (in a more compact notation) is made possible by two facts:
enumerators are integers that start at 0 by default, and C arrays are indexed from 0.
The trick is to create an array of strings and use the enums to index the array. Here is an example of
how it is done:
Storage Class in C
Before moving ahead, let’s quickly understand the difference between the lifetime and the scope of
a variable.
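Scope of variable
The region of the program (for example, which function or block) in which the variable can be
accessed. Basically, this is the visibility of the variable.
Life of the variable
The period during program execution for which the variable exists in memory.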
Scope of variables
File scope: Starts at the beginning of the file (also called a translation unit) and ends at the
end of the file. It refers only to those identifiers that are declared outside of all functions.
File scope identifiers are visible throughout the entire file. Variables that have file scope are
global.
Block scope: Begins with the opening { of a block and ends with its associated closing }.
However, block scope also extends to function parameters in a function definition. That is,
function parameters are included in a function’s block scope. Variables with block scope are local
to their block.
Function prototype scope: Identifiers declared in a function prototype are visible only within the
prototype.
Function scope: Begins with the opening { of a function and ends with its closing }. Function scope
applies only to labels. A label is used as the target of a goto statement, and that label must be
within the same function as the goto.
Storage Classes in C
In the C language, the lifetime and scope of a variable are determined by its storage class. C has
four storage classes:
Automatic (auto)
Register
External (extern)
Static
Auto
Features:
Storage: Memory
Scope: Local / Block scope
Lifetime: Exists as long as control remains in the block
Default initial value: Garbage
The auto storage class in C is the default storage class for all local variables.
{
int mount;
auto int month;
}
The example above defines two variables within the same storage class. By default, all the local
variables are auto. ‘auto‘ can only be used within functions, i.e., local variables.
Register
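Features:
Storage: CPU register (if one is available)
Scope: Local / Block scope
Lifetime: Exists as long as control remains in the block
Default initial value: Garbage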
A value stored in a CPU register can generally be accessed faster than one stored in memory.
Therefore, if a variable is used in many places in a program, it may help to declare its storage
class as register.
However, declaring a variable as register does not guarantee that it will be stored in a CPU
register. The number of CPU registers is limited, and they may be busy with other work. In that
case, the variable is treated like a variable of the default storage class, i.e. auto.
Note: Whether a variable can be stored in a CPU register also depends on the capabilities of the
microprocessor. For example, if the microprocessor has 16-bit registers, a single register cannot
hold a float or a double value, which typically require 4 and 8 bytes respectively. However, if you
use the register storage class for a float or double variable, you won’t get any error messages,
because the compiler simply treats it as auto.
A typical use is a loop counter or another variable that is accessed very frequently inside a loop.
Extern
Features:
Storage: Memory
Scope: Global / File scope
Lifetime: As long as the program’s execution doesn’t come to an end
Default initial value: Zero
The extern specifier gives the declared variable external storage class. The principal use of extern is
to specify that a variable is declared with external linkage elsewhere in the program. To understand
why this is important, it is necessary to understand the difference between a declaration and a
definition. A declaration declares the name and type of a variable or function. A definition causes
storage to be allocated for the variable or the body of the function to be defined. The same variable
or function may have many declarations, but there can be only one definition for that variable or
function.
When the extern specifier is used in a variable declaration, no storage is allocated for that
variable, and it is assumed that the variable is defined elsewhere in the program. Such a
declaration should not have an initializer, because with the extern specifier the variable is only
declared, not defined.
In the following sample C program, if you remove extern int x; you will get an error “Undeclared
identifier ‘x’” because variable x is defined later than it is used in printf. In this example,
the extern specifier tells the compiler that variable x is defined elsewhere and is declared here
for the compiler’s information.
#include <stdio.h>
extern int x;
int main()
{
printf("x: %d\n", x);
}
int x=10;
Also, if you change the statement extern int x; to extern int x = 50; you will again get an error,
“Redefinition of ‘x’”: an extern declaration with an initializer is itself a definition, and x is
already defined by int x=10; later in the file.
Mostly this extern keyword will be used when we want to share the global variable in two or
more .c files.
Note that extern can also be applied to a function declaration, but doing so is redundant because all
function declarations are implicitly extern.
Static Variables
Features:
Storage: Memory
Scope: Block scope
Lifetime: Value of the variable persists between different function calls
Default initial value: Zero
Static variables are those variables whose lifetime remains equal to the lifetime of the program like
global variables. Any local or global variable can be made static depending upon what the logic
expects out of that variable. Let’s consider the following example:
#include<stdio.h>
char** func_Str();
int main(void)
{
char **ptr = NULL;
ptr = func_Str();
printf("\n [%s] \n",*ptr);
return 0;
}
char** func_Str()
{
char *p = "Linux";
return &p;
}
In the code above, the function ‘func_Str()’ returns the address of the pointer ‘p’ to the calling
function which uses it further to print the string ‘Linux’ to the user through ‘printf()’. Let’s look at
the output:
$ ./static
[Linux]
$
The output above is as expected. So, is everything fine here? Well, there is a hidden problem in the
code: the return value of ‘func_Str()’ is the address of the local pointer variable ‘p’. Since ‘p’
is local to the function, its lifetime ends as soon as the function returns, and its memory location
becomes free for other use. Dereferencing the returned address afterwards is undefined behavior; it
only appears to work here because that memory has not been reused yet.
#include<stdio.h>
char** func1_Str();
char** func2_Str();
int main(void)
{
char **ptr1 = NULL;
char **ptr2 = NULL;
ptr1 = func1_Str();
printf("\n [%s] \n",*ptr1);
ptr2 = func2_Str();
printf("\n [%s] \n",*ptr2);
printf("\n [%s] \n",*ptr1);
return 0;
}
char** func1_Str()
{
char *p = "Linux";
return &p;
}
char** func2_Str()
{
char *p = "Windows";
return &p;
}
In the code above, now there are two functions ‘func1_Str()’ and ‘func2_Str()’. The logical problem
remains the same here too. Each of these functions returns the address of its local variable. In
the main() function, the address returned by the func1_Str() is used to print the string ‘Linux’ (as
pointed by its local pointer variable) and the address returned by the function func2_Str() is used to
print the string ‘Windows’ (as pointed by its local pointer variable). An extra step towards the end of
the main() function is done by again using the address returned by func1_Str() to print the string
‘Linux’.
$ ./static
[Linux]
[Windows]
[Windows]
The output above is not what one might hope for: the third print should have been ‘Linux’ instead
of ‘Windows’. In fact, this output is to be expected; this scenario simply exposes the flaw in the
code.
Let’s go a bit deeper to see what happened after the address of the local variable was returned. See
the code below:
#include<stdio.h>
char** func1_Str();
char** func2_Str();
int main(void)
{
char **ptr1 = NULL;
char **ptr2 = NULL;
ptr1 = func1_Str();
printf("\n [%s] :: func1_Str() address = [%p], its returned address is [%p]\n",*ptr1,(void*)func1_Str,(void*)ptr1);
ptr2 = func2_Str();
printf("\n [%s] :: func2_Str()address = [%p], its returned address is [%p]\n",*ptr2,(void*)func2_Str,(void*)ptr2);
printf("\n [%s] [%p]\n",*ptr1,(void*)ptr1);
return 0;
}
char** func1_Str()
{
char *p = "Linux";
return &p;
}
char** func2_Str()
{
char *p = "Windows";
return &p;
}
The code above is modified to print the address of the functions and the address of their
respective local pointer variables. Here is the output:
$ ./static
[Linux] :: func1_Str() address = [0x4005d5], its returned address is [0x7fff705e9378]
[Windows] :: func2_Str()address = [0x4005e7], its returned address is [0x7fff705e9378]
[Windows] [0x7fff705e9378]
The above output makes it clear that once the lifetime of the local variable of ‘func1_Str()’ is
over, the same memory address is reused for the local pointer variable of ‘func2_Str()’, and hence
the third print is ‘Windows’ and not ‘Linux’.
So, the root of the problem is the lifetime of the pointer variables. This is where the ‘static’
storage class comes to the rescue. As already discussed, the static storage class makes the
lifetime of a variable equal to that of the program. So, let’s make the local pointer variables
static and then see the output:
#include<stdio.h>
char** func1_Str();
char** func2_Str();
int main(void)
{
char **ptr1 = NULL;
char **ptr2 = NULL;
ptr1 = func1_Str();
printf("\n [%s] :: func1_Str() address = [%p], its returned address is [%p]\n",*ptr1,(void*)func1_Str,(void*)ptr1);
ptr2 = func2_Str();
printf("\n [%s] :: func2_Str()address = [%p], its returned address is [%p]\n",*ptr2,(void*)func2_Str,(void*)ptr2);
printf("\n [%s] [%p]\n",*ptr1,(void*)ptr1);
return 0;
}
char** func1_Str()
{
static char *p = "Linux";
return &p;
}
char** func2_Str()
{
static char *p = "Windows";
return &p;
}
Note that in the code above, the pointers were made static. Here is the output:
$ ./static
[Linux] :: func1_Str() address = [0x4005d5], its returned address is [0x601028]
[Windows] :: func2_Str()address = [0x4005e0], its returned address is [0x601020]
[Linux] [0x601028]
So, we see that after making the variables as static, the lifetime of the variables becomes equal to
that of the program.
Impact on Scope
In the case where code is spread over multiple files, the static storage class can be used to limit
the scope of a variable to a particular file. For example, if we have a variable ‘count’ in one file
and we want to have another variable with the same name in some other file, then at least one of
the variables has to be made static. The following example illustrates it :
static.c
#include<stdio.h>
int count = 1;
int main(void)
{
printf("\n count = [%d]\n",count);
return 0;
}
static_1.c
#include<stdio.h>
int count = 4;
int func(void)
{
printf("\n count = [%d]\n",count);
return 0;
}
Now, when both files are compiled and linked together to form a single executable, the linker
invoked by GCC reports an error: it complains of multiple definitions of the variable ‘count’.
Now, let’s make the variable ‘count’ in static.c static:
static.c
#include<stdio.h>
static int count = 1;
int main(void)
{
printf("\n count = [%d]\n",count);
return 0;
}
static_1.c
#include<stdio.h>
int count = 4;
int func(void)
{
printf("\n count = [%d]\n",count);
return 0;
}
So, we see that no error is thrown this time because static has limited the scope of the variable
‘count’ in file static.c to the file itself.
Static Functions
By default, any function that is defined in a C file is extern. This means that the function can be
used in any other source file of the same code/project (which gets compiled as a separate
translation unit). Now, if access to a function is to be limited to the file in which it is
defined, or if a function with the same name is desired in some other file of the same
code/project, then the function can be made static.
Extending the same example that was used in the previous section, suppose we have two files :
static.c
#include<stdio.h>
int func(void);
int main(void)
{
func();
return 0;
}
void funcNew()
{
printf("\n Hi, I am a normal function\n");
}
static_1.c
#include<stdio.h>
void funcNew();
int func(void)
{
funcNew();
return 0;
}
So, the function funcNew() is defined in one file and can be called successfully from the other.
Now, suppose the file static_1.c wants to have its own funcNew(), i.e.:
static_1.c
#include<stdio.h>
void funcNew();
int func(void)
{
funcNew();
return 0;
}
void funcNew()
{
printf("\n Hi, I am a normal function\n");
}
Now the linker complains of multiple definitions of the function funcNew(). So, we make the
funcNew() in static_1.c static:
static_1.c
#include<stdio.h>
static void funcNew(void);
int func(void)
{
funcNew();
return 0;
}
static void funcNew(void)
{
printf("\n Hi, I am also a normal function\n");
}
Final Comparison
Macros are generally used to define constant values that are being used repeatedly in programs.
Macros can even accept arguments and such macros are known as function-like macros. It can be
useful if tokens are concatenated into code to simplify some complex declarations. Macros provide
text replacement functionality at pre-processing time by the preprocessor.
Simple Macro
#define MAX_SIZE 10
Now let’s see an example through which we will confirm that macros are replaced by their values at
pre-processing time.
Here is a C program :
#include<stdio.h>
#define MAX_SIZE 10
int main(void)
{
int size = 0;
size = size + MAX_SIZE;
printf("\n The value of size is [%d]\n",size);
return 0;
}
Now let’s compile it with the -save-temps flag so that the pre-processing output (a file with the
extension .i) is produced along with the final executable, for example with
gcc -save-temps macro.c -o macro (assuming the source file is named macro.c).
This command produces all the intermediate files of the GCC compilation process. One of these files
will be macro.i. This is the file of interest. If you open this file and go to the bottom, you will
see:
...
...
...
int main(void)
{
int size = 0;
size = size + 10;
printf("\n The value of size is [%d]\n",size);
return 0;
}
So, you see that the macro MAX_SIZE was replaced with its value (10) in preprocessing stage of the
compilation process.
Macros are expanded by the preprocessor and are thus guaranteed to be inlined. Macros are used for
short operations, and they avoid function call overhead. A macro can be used if a short operation
is performed repeatedly in the program. Function-like macros are very beneficial when the same
block of code needs to be executed multiple times.
Example
Here are some examples that define macros for swapping numbers, square of numbers, logging
function, etc.
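#include <stdio.h>
/* Logging macro (assumed definition): forwards its arguments to fprintf. */
#define TRACE_LOG(fmt, ...) fprintf(stdout, fmt, __VA_ARGS__)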
int main()
{
int i=1;
TRACE_LOG("%s", "Sample macro\n");
TRACE_LOG("%d %s", i, "Sample macro\n");
return 0;
}
$ ./macro2
Sample macro
1 Sample macro
C Conditional Macros
Conditional macros are very useful to apply conditions. Code snippets are guarded with a condition
checking if a certain macro is defined or not. They are very helpful in large projects having code
segregated as per releases of the project. If some part of code needs to be executed for release 1 of
the project and some other part of code needs to be executed for release 2, then it can be easily
achieved through conditional macros.
#ifdef PRJ_REL_01
..
.. code of REL 01 ..
..
#else
..
.. code of REL 02 ..
..
#endif
To comment out multiple lines of code, the macro is commonly used like below :
#if 0
..
.. code to be commented ..
..
#endif
Here, we will understand the above features of macro through the program that is given below.
Example
#include <stdio.h>
int main()
{
#if 0
printf("commented code 1");
printf("commented code 2");
#endif
#define TEST1 1
#ifdef TEST1
printf("MACRO TEST1 is defined\n");
#endif
#ifdef TEST3
printf("MACRO TEST3 is defined\n");
#else
printf("MACRO TEST3 is NOT defined\n");
#endif
return 0;
}
Output:
$ ./macro
MACRO TEST1 is defined
MACRO TEST3 is NOT defined
Here, we can see that “commented code 1” and “commented code 2” are not printed because these
lines of code are disabled by the #if 0 block. The TEST1 macro is defined, so the string “MACRO
TEST1 is defined” is printed. Since the macro TEST3 is not defined, “MACRO TEST3 is defined” is
not printed, and “MACRO TEST3 is NOT defined” is printed instead.
Inline functions are functions whose definition is small and can be substituted at the place where
the function is called. Basically, the function body is inlined at its call site. However, there is
no guarantee that a function will actually be inlined: the compiler interprets the inline keyword
as a mere hint or request to substitute the code of the function at its call. People usually say
that an inline function increases performance by saving the overhead of a function call (i.e.
passing arguments, saving the return address, returning the value, setting up and tearing down the
stack frame, etc.), but whether an inline function helps or hurts depends purely on your code
design and is largely debatable. The compiler performs inlining as an optimization. If compiler
optimization has been disabled, inline functions may not serve their purpose, and their calls may
not be replaced by their definitions. To have GCC inline your function regardless of optimization
level, declare the function with the “always_inline” attribute:
Example
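#include <stdio.h>
/* Assumed definitions of the two inline functions used below. */
static inline void test_inline_func1(int a, int b)
{
printf("a=%d and b=%d\n", a, b);
}
static inline int test_inline_func2(int x)
{
return x * x;
}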
int main()
{
int tmp;
test_inline_func1(2,4);
tmp = test_inline_func2(5);
printf("square val=%d\n", tmp);
return 0;
}
Output:
$ ./inline
a=2 and b=4
square val=25
Now, let us see how inline functions are defined. It is very simple: we only need to specify the
“inline” keyword in the definition. The keyword requests the compiler to optimize this function by
avoiding function call overhead. When the request is honored, each call to the inline function is
replaced by the body of the function.
Compared with macros, inline functions have the following advantages:
1. Since they are functions, the types of their arguments are checked by the compiler.
2. Their arguments are evaluated exactly once, so there is no risk when an argument is an
expression. Macros, in contrast, can be dangerous when the argument is an expression.
3. They can include multiple lines of code without trailing backslashes.
4. Inline functions have their own scope for variables, and they can return a value.
It is a common misconception that inlining always equals faster code. If the inline function has
many lines, or if it is called from many places, inlining can waste space.
Remember, inlining is only a request to the compiler, not a command. The compiler can ignore the
request for inlining. The compiler may not perform inlining in circumstances such as:
3. If a function is recursive.
5. If a function contains switch or goto statement.
Pointers in C – Part 1
Introduction
The concept of pointers is one of the most powerful fundamentals of the C/C++ language.
Through pointers, a developer can directly access memory from his/her code, which makes
memory-related operations very fast. But, as always, with great power comes great
responsibility. A developer has to use pointers very carefully in order to avoid
problems that can be a nightmare to debug.
Unlike normal variables, which store values, pointers are special variables that
hold the address of a variable. Since they store the memory address of a variable,
pointers are very commonly said to “point to variables”. Let’s try to understand the concept.
As shown in the above diagram:
A normal variable ‘var’ has a memory address of 1001 and holds a value of 50.
A pointer variable has its own address 2047 but stores 1001, which is the address of the
variable ‘var’.
A pointer is declared as :
<pointer-type> *<pointer-name>
1. Pointer-type: It specifies the type of pointer. It can be int, char, float, etc. This type specifies
the type of variable whose address this pointer can store.
2. Pointer-name: It can be any name specified by the user. An example of a pointer declaration
can be :
char *chptr;
In the above declaration, ‘char’ signifies the pointer type, chptr is the name of the pointer while
the asterisk ‘*’ signifies that ‘chptr’ is a pointer variable.
A pointer is initialized in the following way:
<pointer declaration> = <address of a variable>
(OR)
<pointer declaration>
<name-of-pointer> = <address of a variable>
Note that the type of the variable above should be the same as the pointer type. (Although this
is not a strict rule, it should be kept in mind.) For example:
char ch = 'c';
char *chptr = &ch; //initialize
OR
char ch = 'c';
char *chptr;
chptr = &ch; //initialize
In the code above, we declared a character variable ch which stores the value ‘c’. Now, we
declared a character pointer ‘chptr’ and initialized it with the address of variable ‘ch’. Note that
the ‘&’ operator is used to access the address of any type of variable.
A pointer can be used in two contexts.
Context 1: For accessing the address of the variable whose memory address the pointer stores.
Again consider the following code :
char ch = 'c';
char *chptr = &ch;
Now, whenever we refer to the name ‘chptr’ in the code after the above two lines, the
compiler fetches the value contained in this pointer variable, which is the address of
the variable (‘ch’) to which the pointer points, i.e. the value given by ‘chptr’ is equal to
‘&ch’.
For example:
char *ptr = chptr;
The value held by ‘chptr’ (which in this case is the address of the variable ‘ch’) is assigned to the
new pointer ‘ptr’.
Context 2: For accessing the value of the variable whose memory address the pointer stores.
char ch = 'c';
char t;
char *chptr = &ch;
t = *chptr;
We see that in the last line above, we have used ‘*’ before the name of the pointer. What does
this asterisk operator do? When applied to a pointer variable name (as in the last line
above), this operator yields the value of the variable to which the pointer points. In this
case, ‘*chptr’ yields the value kept at the address held by ‘chptr’. Since ‘chptr’ holds
the address of the variable ‘ch’ and the value of ‘ch’ is ‘c’, ‘*chptr’ yields ‘c’.
When used with pointers, the asterisk ‘*’ operator is also known as the ‘value of’ operator.
An Example of C Pointers
Consider the following code:
#include <stdio.h>
int main(void)
{
    char ch = 'c';
    char *chptr = &ch;            // pointer to a char
    int i = 20;
    int *intptr = &i;             // pointer to an int
    float f = 1.20000;
    float *fptr = &f;             // pointer to a float
    char *ptr = "I am a string";  // pointer to the first character of a string
    printf("\n [%c], [%d], [%f], [%c], [%s]\n", *chptr, *intptr, *fptr,
           *ptr, ptr);
    return 0;
}
OUTPUT :
$ ./pointers
[c], [20], [1.200000], [I], [I am a string]
The above code covers pointers to the most common types. The first three of them are
straightforward, so let’s concentrate on the fourth one. In the fourth example, a character
pointer points to a string. In C, a string is nothing but an array of characters, so there is no
separate string pointer type in C. Character pointers are used in the case of strings too.
Now, coming to the string, when we point a pointer to a string, by default it holds the address of
the first character of the string. Let’s try to understand it better.
Address   1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012
Character I    ' '  a    m    ' '  S    t    r    i    n    g    \0
Since each character occupies one byte, the characters are placed in memory as shown above.
Note the last character: it is a null character (‘\0’) that C places at the end of every string
literal. This null character signifies the end of the string.
Now, coming back to the point, any character pointer pointing to a string stores the address of
the first character of the string. In the code above, ‘ptr’ holds the address of the character ‘I’,
i.e. 1001. When we apply the ‘value of’ operator ‘*’ to ‘ptr’, we fetch the value at
address 1001, which is ‘I’, and hence when we print ‘*ptr’, we get ‘I’ as the output. Also, if we
specify the format specifier as ‘%s’ and pass ‘ptr’ (which contains the starting address of the
string), then the complete string is printed by printf. The %s specifier needs the address of the
first byte of the string and prints characters from there until it reaches the null character.
We provided that address using ‘ptr’. This is the last value printed in the output above.
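#include<stdio.h>
struct st {
    int a;
    char ch;
};
int main(void)
{
    struct st obj;
    struct st *stobj = &obj;   // pointer to the structure variable 'obj'
    stobj->a = 5;              // access members through the pointer with ->
    stobj->ch = 'a';
    printf("\n [%d] [%c]\n", stobj->a, stobj->ch);
    return 0;
}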
OUTPUT:
$ ./pointers
[5] [a]
In the above code, we have declared a pointer ‘stobj’ of type ‘struct st *’. Since the pointer
type is a structure, the address it points to has to be that of a ‘struct st’ variable (which in
this case is ‘obj’). Another interesting part is how the structure elements are accessed through
the pointer variable ‘stobj’. When a structure is accessed through a pointer, the arrow
operator ‘->’ is used instead of the ‘.’ operator (which would have been used had we accessed
the structure elements through ‘obj’ directly).
Pointers in C – Part 2
Pointers Advanced concepts
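Now, we will try to develop an understanding of some of the relatively complex concepts. The
following are explained in this article with examples: constant pointers, pointers to constants,
pointers to pointers, arrays of pointers, and function pointers.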
As a developer, you should understand the difference between a constant pointer and a pointer
to a constant.
C Constant pointer
A pointer is said to be a constant pointer when the address it is pointing to cannot be
changed.
Let’s take an example :
char ch, c;
char *ptr = &ch;
ptr = &c;
In the above example, we defined two characters (‘ch’ and ‘c’) and a character pointer ‘ptr’.
First, the pointer ‘ptr’ contained the address of ‘ch’, and in the next line it contained the address
of ‘c’. In other words, ‘ptr’ initially pointed to ‘ch’ and then pointed to ‘c’.
But in the case of a constant pointer, once the pointer holds an address, that address cannot be
changed. That means a constant pointer cannot be made to point to a new address after it has
been initialized.
If ‘ptr’ in the example above were declared as a constant pointer, the third line would not be
valid.
For example:
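#include<stdio.h>
int main(void)
{
    char ch = 'c';
    char c = 'a';
    char *const ptr = &ch; // A constant pointer 'ptr' pointing to 'ch'
    ptr = &c; // WRONG!!! Cannot assign a new address to a constant pointer
    return 0;
}
When the code above is compiled, the compiler reports an error at the line that assigns ‘&c’ to
‘ptr’, because ‘ptr’ is read-only.
So, as expected, the compiler throws an error when we try to change the address held by the
constant pointer.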
C Pointer to Constant
This concept is easy to understand because the name describes it. As the name itself
suggests, this type of pointer cannot be used to change the value at the address it points to.
char ch = 'c';
char *ptr = &ch;
*ptr = 'a';
In the above example, we used a character pointer ‘ptr’ that points to the character ‘ch’. In the
last line, we change the value at the address pointed to by ‘ptr’. But if ‘ptr’ had been a
pointer to a constant, the last line would have been invalid, because a pointer to a constant
cannot change the value at the address it is pointing to.
For example:
#include<stdio.h>
int main(void)
{
    char ch = 'c';
    const char *ptr = &ch; // A pointer to a constant, 'ptr', pointing to 'ch'
    *ptr = 'a'; // WRONG!!! Cannot change the value at the address pointed to by 'ptr'
    return 0;
}
When the above code is compiled, the compiler reports an error at the line that assigns to
‘*ptr’, because the location is read-only when accessed through ‘ptr’.
So now we know the reason behind the error: we cannot change the value pointed to
by a pointer to a constant.
C Pointer to Pointer
Till now, we have used pointers to data types like character, integer, etc. In this
section, we will learn about pointers that point to other pointers.
By definition, a pointer is a special variable that can store the address of
another variable. That other variable can very well be a pointer. This means that it is
perfectly legal for a pointer to point to another pointer.
Suppose we have a pointer ‘p1’ that points to yet another pointer ‘p2’, which in turn points to a
character ‘ch’. In memory, the three variables are related as follows:
The pointer ‘p1’ holds the address of the pointer ‘p2’.
The pointer ‘p2’ holds the address of the character ‘ch’.
So ‘p2’ is a pointer to the character ‘ch’, while ‘p1’ is a pointer to ‘p2’. We can also say that
‘p1’ is a pointer to a pointer to the character ‘ch’.
In code, ‘p2’ is declared as ‘char *p2 = &ch;’ and ‘p1’ as ‘char **p1 = &p2;’. So ‘p1’ is a double
pointer (i.e. a pointer to a pointer to a character), hence the two *s in its declaration.
Now, consider the following code:
#include<stdio.h>
int main(void)
{
    char **ptr = NULL;
    char *p = NULL;
    char c = 'd';
    p = &c;     // 'p' holds the address of 'c'
    ptr = &p;   // 'ptr' holds the address of 'p'
    printf("\n c = [%c]\n",c);
    printf("\n *p = [%c]\n",*p);
    printf("\n **ptr = [%c]\n",**ptr);
    return 0;
}
$ ./doubleptr
c = [d]
*p = [d]
**ptr = [d]
C Array of Pointers
Just like an array of integers or characters, there can be an array of pointers too.
<type> *<name>[<number-of-elements>];
For example:
char *ptr[3];
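#include<stdio.h>
int main(void)
{
    char *p1 = "Embetronicx";
    char *p2 = "Embedded";
    char *p3 = "Tutorials";
    char *arr[3];   // an array of three character pointers
    arr[0] = p1;
    arr[1] = p2;
    arr[2] = p3;
    printf("\n p1 = [%s]\n", p1);
    printf("\n p2 = [%s]\n", p2);
    printf("\n p3 = [%s]\n", p3);
    printf("\n arr[0] = [%s]\n", arr[0]);
    printf("\n arr[1] = [%s]\n", arr[1]);
    printf("\n arr[2] = [%s]\n", arr[2]);
    return 0;
}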
In the above code, we took three pointers pointing to three strings. Then we declared an array
that can contain three pointers. We assigned the pointers ‘p1’, ‘p2’, and ‘p3’ to indexes 0, 1,
and 2 of the array. Let’s see the output:
$ ./arrayofptr
p1 = [Embetronicx]
p2 = [Embedded]
p3 = [Tutorials]
arr[0] = [Embetronicx]
arr[1] = [Embedded]
arr[2] = [Tutorials]
C Function Pointers
Just like pointers to characters, integers, etc, we can have pointers to functions.
For example:
int (*fptr)(int, int);
The above line declares a function pointer ‘fptr’ that can point to a function whose return type
is ‘int’ and takes two integers as arguments.
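#include<stdio.h>
int func(int a, int b)
{
    printf("\n a = %d\n", a);
    printf("\n b = %d\n", b);
    return 0;
}
int main(void)
{
    int(*fptr)(int,int); // Function pointer
    fptr = func; // Assign the address of 'func' to the function pointer
    func(2,3);
    fptr(2,3);
    return 0;
}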
In the above example, we defined a function ‘func’ that takes two integers as inputs and returns
an integer. In the main() function, we declare a function pointer ‘fptr’ and then assign the
address of ‘func’ to it. Note that the name of a function can be treated as the starting address of
the function, so we can assign the address of a function to a function pointer using the
function’s name. Let’s see the output:
$ ./fptr
a = 2
b = 3
a = 2
b = 3
From the above output, we see that calling the function through the function pointer produces
the same output as calling the function by its name.
If both methods behave in the same way, then what is the use of a function pointer?
Here are some of its uses:
Function pointers can be useful when you want to create a callback mechanism and
need to pass the address of a function to another function.
They can also be useful when you want to store an array of functions and choose which one to
call at run time.
I hope this has helped you. If you have any doubts, please comment below. In our next
tutorial, we will discuss the different types of pointers.