Socket Programming
CS 60 Computer Networks
Lecture 3 and 4
Socket Programming
How do we build Internet applications? In this lecture, we will discuss the socket API and its support for TCP
and UDP communication between end hosts. The socket API is the key interface for programming
distributed applications on the Internet.
BTW, Kurose/Ross only cover Java socket programming and not C socket programming discussed below.
Goals
What is a socket?
The client-server model
Byte order
TCP socket API
UDP socket API
Concurrent server design
The basics
Program A program is an executable file residing on a disk in a directory. A program is read into memory
and is executed by the kernel as a result of a call to one of the exec() functions. There are six variants of
exec(), but we only consider the simplest one in this course.
Process An executing instance of a program is called a process. Sometimes, task is used instead of process
with the same meaning. UNIX guarantees that every process has a unique identifier called the process ID.
The process ID is always a non-negative integer.
File descriptors File descriptors are normally small non-negative integers that the kernel uses to identify the
files being accessed by a particular process. Whenever a process opens an existing file or creates a new file, the
kernel returns a file descriptor that the process uses to read or write the file. As we will see in this course,
sockets are based on a very similar mechanism (socket descriptors).
The client-server model is one of the most used communication paradigms in networked systems. A client
normally communicates with one server at a time. From a server’s perspective, at any point in time, it is not
unusual for a server to be communicating with multiple clients. The client needs to know of the existence of and
the address of the server, but the server does not need to know the address of (or even the existence of) the
client prior to the connection being established.
Client and servers communicate by means of multiple layers of network protocols. In this course we will
focus on the TCP/IP protocol suite.
The scenario in which the client and the server are on the same local network (usually called a LAN, Local Area
Network) is shown in Figure 1.
http://www.cs.dartmouth.edu/~campbell/cs60/socketprogramming.html 1/16
2/5/2018 www.cs.dartmouth.edu/~campbell/cs60/socketprogramming.html
Figure 1: Client and server on the same Ethernet communicating using TCP/IP.
The client and the server may be in different LANs, with both LANs connected to a Wide Area Network
(WAN) by means of routers. The largest WAN is the Internet, but companies may have their own WANs.
This scenario is depicted in Figure 2.
The flow of information between the client and the server goes down the protocol stack on one side, then
across the network and then up the protocol stack on the other side.
UDP is a simple transport-layer protocol. The application writes a message to a UDP socket, which is then
encapsulated in a UDP datagram, which is further encapsulated in an IP datagram, which is sent to the
destination.
There is no guarantee that a UDP datagram will reach the destination, that the order of the datagrams will be
preserved across the network, or that datagrams arrive only once.
The problem with UDP is its lack of reliability: if a datagram reaches its final destination but the checksum
detects an error, or if the datagram is dropped in the network, it is not automatically retransmitted.
Each UDP datagram is characterized by a length. The length of a datagram is passed to the receiving
application along with the data.
No connection is established between the client and the server and, for this reason, we say that UDP provides
a connection-less service.
TCP provides a connection-oriented service, since it is based on connections between clients and servers.
TCP provides reliability. When a TCP client sends data to the server, it requires an acknowledgement in
return. If an acknowledgement is not received, TCP automatically retransmits the data and waits for a longer
period of time.
We have mentioned that UDP datagrams are characterized by a length. TCP is instead a byte-stream
protocol, without any boundaries at all.
TCP is described in RFC 793, RFC 1323, RFC 2581 and RFC 3390.
Socket addresses
The IPv4 socket address structure is named sockaddr_in and is defined by including the <netinet/in.h> header.
struct in_addr {
    in_addr_t s_addr;           /* 32-bit IPv4 address, network byte ordered */
};
struct sockaddr_in {
    uint8_t        sin_len;     /* length of structure (16) */
    sa_family_t    sin_family;  /* AF_INET */
    in_port_t      sin_port;    /* 16-bit TCP or UDP port number */
    struct in_addr sin_addr;    /* 32-bit IPv4 address */
    char           sin_zero[8]; /* not used but always set to zero */
};
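As a minimal sketch of how such a structure is typically filled in, the helper below (make_ipv4_addr is a hypothetical name, not part of the socket API) zeroes the structure, sets the family, and stores the port and address in network byte order. It assumes a dotted-decimal IPv4 string as input. Note that sin_len exists only on BSD-derived systems, so the sketch never touches it.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* make_ipv4_addr is a hypothetical helper, not part of the socket API.
 * It fills *addr for a dotted-decimal IP string and a host-order port.
 * Returns 0 on success, -1 if ip is not a valid IPv4 address. */
int make_ipv4_addr(const char *ip, uint16_t port, struct sockaddr_in *addr)
{
    memset(addr, 0, sizeof(*addr));   /* also zeroes sin_zero */
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);     /* port in network byte order */
    if (inet_pton(AF_INET, ip, &addr->sin_addr) != 1)
        return -1;                    /* string was not an IPv4 address */
    return 0;
}
```

For example, make_ipv4_addr("127.0.0.1", 8080, &servaddr) prepares an address for a server on the local host.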
A socket address structure is always passed by reference as an argument to any socket function. But any
socket function that takes one of these pointers as an argument must deal with socket address structures from
any of the supported protocol families.
A problem arises in declaring the type of pointer that is passed. With ANSI C, the solution is to use void *
(the generic pointer type). But the socket functions predate the definition of ANSI C and the solution chosen
was to define a generic socket address as follows:
struct sockaddr {
    uint8_t     sa_len;
    sa_family_t sa_family;   /* address family: AF_xxx value */
    char        sa_data[14];
};
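Networking protocols such as TCP are based on a specific network byte order. The Internet protocols use
big-endian byte ordering. Because a host may store integers in a different order, the following functions
convert between host byte order and network byte order:
#include <netinet/in.h>
uint16_t htons(uint16_t host16bitvalue);
uint32_t htonl(uint32_t host32bitvalue);
uint16_t ntohs(uint16_t net16bitvalue);
uint32_t ntohl(uint32_t net32bitvalue);
The first two return the value in network byte order (16 and 32 bit, respectively). The last two return the value
in host byte order (16 and 32 bit, respectively).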
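To make the effect of the conversion concrete, the following sketch copies a port number into a byte array exactly as it would be laid out on the wire. The helper name port_to_wire is an assumption made for illustration. Whatever the host's own byte order, the most significant byte ends up first.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* port_to_wire is a hypothetical helper: it stores the two bytes of a port
 * number in out[] in the order they are transmitted on the network. */
void port_to_wire(uint16_t port, unsigned char out[2])
{
    uint16_t net = htons(port);       /* host order -> network (big-endian) order */
    memcpy(out, &net, sizeof(net));   /* copy the in-memory representation */
}
```

With port 0x1234, out[0] holds 0x12 and out[1] holds 0x34 on both little-endian and big-endian hosts. Applying ntohs() to the result of htons() gives back the original value.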
The sequence of function calls for the client and a server participating in a TCP connection is presented in
Figure 3.
As shown in the figure, the steps for establishing a TCP socket on the client side are the following:
Create a socket using the socket() function.
Connect the socket to the address of the server using the connect() function.
Send and receive data by means of send() and recv().
Close the connection by means of the close() function.
The steps involved in establishing a TCP socket on the server side are as follows:
Create a socket with the socket() function.
Bind the socket to an address using the bind() function.
Listen for connections with the listen() function.
Accept a connection with the accept() system call. This call typically blocks until a client
connects with the server.
Send and receive data by means of send() and recv().
Close the connection by means of the close() function.
The first step is to call the socket() function, specifying the type of communication protocol (TCP based on
IPv4, TCP based on IPv6, UDP).
#include <sys/socket.h>
int socket(int family, int type, int protocol);
where family specifies the protocol family (AF_INET for the IPv4 protocols), type is a constant describing the
type of socket (SOCK_STREAM for stream sockets and SOCK_DGRAM for datagram sockets), and protocol is
normally set to 0 to select the default protocol for the given family and type.
The function returns a non-negative integer, similar to a file descriptor, that we call the socket
descriptor, or -1 on error.
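A small sketch of the call follows. The wrapper name open_ipv4_socket is hypothetical; it only adds error reporting around socket() and assumes IPv4.

```c
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* open_ipv4_socket is a hypothetical wrapper around socket().
 * type is SOCK_STREAM (TCP) or SOCK_DGRAM (UDP).
 * Returns the socket descriptor, or -1 after printing the error. */
int open_ipv4_socket(int type)
{
    int sockfd = socket(AF_INET, type, 0);   /* 0: default protocol for the type */
    if (sockfd < 0)
        perror("socket");
    return sockfd;
}
```

Calling open_ipv4_socket(SOCK_STREAM) yields a TCP socket and open_ipv4_socket(SOCK_DGRAM) a UDP socket. Both are ordinary descriptors and should eventually be passed to close().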
The connect() function is used by a TCP client to establish a connection with a TCP server.
#include <sys/socket.h>
int connect (int sockfd, const struct sockaddr *servaddr, socklen_t addrlen);
The function returns 0 if it succeeds in establishing a connection (i.e., a successful TCP three-way
handshake), -1 otherwise.
The client does not have to call bind() before calling this function: the kernel will choose both an
ephemeral port and the source IP address if necessary.
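The client side of connection setup can be sketched as follows. The helper tcp_connect is a hypothetical name, and the sketch assumes an IPv4 server identified by a dotted-decimal address string.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* tcp_connect is a hypothetical helper: create a TCP socket and connect it
 * to ip:port. Returns the connected descriptor, or -1 on failure. */
int tcp_connect(const char *ip, uint16_t port)
{
    struct sockaddr_in servaddr;
    int sockfd;

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &servaddr.sin_addr) != 1)
        return -1;                        /* not a valid IPv4 address */

    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
        return -1;

    /* No bind(): the kernel picks the source IP and an ephemeral port. */
    if (connect(sockfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0) {
        close(sockfd);                    /* do not leak the descriptor */
        return -1;
    }
    return sockfd;
}
```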
The bind() function assigns a local protocol address to a socket. With the Internet protocols, the address is the
combination of an IPv4 or IPv6 address (32-bit or 128-bit) along with a 16-bit TCP or UDP port number.
#include <sys/socket.h>
int bind(int sockfd, const struct sockaddr *servaddr, socklen_t addrlen);
where sockfd is the socket descriptor, servaddr is a pointer to a protocol-specific address and addrlen is the
size of the address structure.
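This use of the generic socket address sockaddr requires that any call to these functions cast the
pointer to the protocol-specific address structure. For example, for an IPv4 socket address structure:
struct sockaddr_in serv;   /* IPv4 socket address structure */
bind(sockfd, (struct sockaddr *) &serv, sizeof(serv));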
A process can bind a specific IP address to its socket: for a TCP client, this assigns the source IP address that
will be used for IP datagrams sent on the socket. For a TCP server, this restricts the socket to receive
incoming client connections destined only to that IP address.
Normally, a TCP client does not bind an IP address to its socket. The kernel chooses the source IP address when
the socket is connected, based on the outgoing interface that is used. If a TCP server does not bind an IP address
to its socket, the kernel uses the destination IP address of the incoming packets as the server’s source address.
Note that the local host address is 127.0.0.1; for example, if you wanted to run your echoServer (see later) on
your local machine, then your client would connect to 127.0.0.1 with the suitable port.
The listen() function converts an unconnected socket into a passive socket, indicating that the kernel should
accept incoming connection requests directed to this socket. It is defined as follows:
#include <sys/socket.h>
int listen(int sockfd, int backlog);
where sockfd is the socket descriptor and backlog is the maximum number of connections the kernel should
queue for this socket. The backlog argument provides a hint to the system of the number of outstanding
connect requests that it should enqueue on behalf of the process. Once the queue is full, the system will reject
additional connection requests. The backlog value must be chosen based on the expected load of the server.
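The server-side sequence of socket(), bind() and listen() can be sketched as one helper. The name tcp_listen and the convention that a NULL address means "any local interface" are assumptions made for this illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* tcp_listen is a hypothetical helper: create a TCP socket, bind it to
 * ip:port and make it passive. If ip is NULL the wildcard address is used.
 * Returns the listening descriptor, or -1 on failure. */
int tcp_listen(const char *ip, uint16_t port, int backlog)
{
    struct sockaddr_in servaddr;
    int listenfd;

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(port);
    if (ip == NULL)
        servaddr.sin_addr.s_addr = htonl(INADDR_ANY);   /* any local interface */
    else if (inet_pton(AF_INET, ip, &servaddr.sin_addr) != 1)
        return -1;                                      /* invalid address string */

    listenfd = socket(AF_INET, SOCK_STREAM, 0);
    if (listenfd < 0)
        return -1;

    /* The cast to the generic struct sockaddr * is required by bind(). */
    if (bind(listenfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0 ||
        listen(listenfd, backlog) < 0) {
        close(listenfd);
        return -1;
    }
    return listenfd;
}
```

Passing "127.0.0.1" restricts the server to connections from the local machine, while NULL accepts connections arriving on any interface. A port of 0 asks the kernel to choose a free port, which can be convenient when experimenting.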
The accept() function is used to retrieve a connect request and convert it into a connection. It is defined as
follows:
#include <sys/socket.h>
int accept(int sockfd, struct sockaddr *cliaddr, socklen_t *addrlen);
where sockfd is the listening socket descriptor. The function returns a new file descriptor that is connected
to the client that called connect(), or -1 on error. The cliaddr and addrlen arguments are used to return the
protocol address of the client. The new socket descriptor has the same socket type and address family as the
original socket. The original socket passed to accept() is not associated with the connection, but instead
remains available to receive additional connect requests. The kernel creates one connected socket for each
client connection that is accepted.
If we don’t care about the client’s identity, we can set cliaddr and addrlen to NULL. Otherwise, before
calling the accept function, cliaddr has to point to a buffer large enough to hold the address, and the
integer pointed to by addrlen has to be set to the size of that buffer.
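The following sketch shows one way to use the cliaddr and addrlen arguments to learn who connected. The helper name accept_client is hypothetical, and the sketch assumes an IPv4 listening socket.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <unistd.h>

/* accept_client is a hypothetical helper: accept one connection on listenfd
 * and report the client's address. ip must have room for INET_ADDRSTRLEN
 * bytes. Returns the connected descriptor, or -1 on error. */
int accept_client(int listenfd, char *ip, uint16_t *port)
{
    struct sockaddr_in cliaddr;
    socklen_t clilen = sizeof(cliaddr);   /* in: buffer size, out: bytes stored */
    int connfd;

    connfd = accept(listenfd, (struct sockaddr *) &cliaddr, &clilen);
    if (connfd < 0)
        return -1;

    /* Convert the binary address to dotted-decimal text. */
    if (inet_ntop(AF_INET, &cliaddr.sin_addr, ip, INET_ADDRSTRLEN) == NULL) {
        close(connfd);
        return -1;
    }
    *port = ntohs(cliaddr.sin_port);      /* back to host byte order */
    return connfd;
}
```

The listening descriptor stays open and can be passed to accept_client() again for the next client.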
Since a socket endpoint is represented as a file descriptor, we can use read and write to communicate with a
socket as long as it is connected. However, if we want to specify options we need another set of functions.
For example, send() is similar to write() but allows us to specify some options. send() is defined as follows:
#include <sys/socket.h>
ssize_t send(int sockfd, const void *buf, size_t nbytes, int flags);
where buf and nbytes have the same meaning as they have with write. The additional argument flags is used
to specify how we want the data to be transmitted. We will not consider the possible options in this course
and will always set flags to 0.
The recv() function is similar to read(), but allows us to specify some options to control how the data are
received. Again, we will not consider the possible options in this course and will always set flags to 0.
#include <sys/socket.h>
ssize_t recv(int sockfd, void *buf, size_t nbytes, int flags);
The function returns the length of the message in bytes, 0 if no messages are available and the peer has
performed an orderly shutdown, or -1 on error.
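Because TCP is a byte stream without message boundaries, a single recv() may return fewer bytes than requested. The sketch below shows one common way to handle the three kinds of return value. The helper name recv_n is an assumption and not part of the socket API.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* recv_n is a hypothetical helper: read exactly n bytes from a stream socket
 * unless the peer closes the connection first.
 * Returns the number of bytes read (less than n only if the peer closed the
 * connection), or -1 on error. */
ssize_t recv_n(int sockfd, void *buf, size_t n)
{
    char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t got = recv(sockfd, p, left, 0);
        if (got < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;             /* real error */
        }
        if (got == 0)
            break;                 /* orderly shutdown by the peer */
        p += got;
        left -= (size_t) got;
    }
    return (ssize_t) (n - left);
}
```

If the sender transmits "hel" and "lo" in two separate send() calls, recv_n(sockfd, buf, 5) still returns the five bytes "hello".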
The normal close() function is used to close a socket and terminate a TCP connection. It returns 0 if it succeeds,
-1 on error. It is defined as follows:
#include <unistd.h>
int close(int sockfd);
Figure 4 shows the interaction between a UDP client and server. First of all, the client does not establish
a connection with the server. Instead, the client just sends a datagram to the server using the sendto function
which requires the address of the destination as a parameter. Similarly, the server does not accept a
connection from a client. Instead, the server just calls the recvfrom function, which waits until data arrives
from some client. recvfrom returns the IP address of the client, along with the datagram, so the server can
send a response to the client.
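As shown in the figure, the steps of establishing a UDP socket communication on the client side are as
follows:
Create a socket using the socket() function.
Send and receive data by means of the sendto() and recvfrom() functions.
The steps of establishing a UDP socket communication on the server side are as follows:
Create a socket with the socket() function.
Bind the socket to an address using the bind() function.
Send and receive data by means of the recvfrom() and sendto() functions.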
In this section, we will describe the two new functions recvfrom() and sendto().
The recvfrom() function is similar to the read() function, but three additional arguments are required. The
recvfrom() function is defined as follows:
#include <sys/socket.h>
ssize_t recvfrom(int sockfd, void *buff, size_t nbytes,
                 int flags, struct sockaddr *from,
                 socklen_t *addrlen);
The first three arguments, sockfd, buff, and nbytes, are identical to the first three arguments of read and
write. sockfd is the socket descriptor, buff is the pointer to the buffer to read into, and nbytes is the number
of bytes to read. In our examples we will set the flags argument to 0. The recvfrom function fills in the
socket address structure pointed to by from with the protocol address of whoever sent the datagram. The number
of bytes stored in the socket address structure is returned in the integer pointed to by addrlen.
The sendto() function is similar to the send() function, but two additional arguments are required. The sendto()
function is defined as follows:
#include <sys/socket.h>
ssize_t sendto(int sockfd, const void *buff, size_t nbytes,
               int flags, const struct sockaddr *to,
               socklen_t addrlen);
The first three arguments, sockfd, buff, and nbytes, are identical to the first three arguments of send. sockfd is
the socket descriptor, buff is the pointer to the buffer to write from, and nbytes is the number of bytes to write.
In our examples we will set the flags argument to 0. The to argument is a socket address structure
containing the protocol address (e.g., IP address and port number) of where the data is sent. addrlen specifies
the size of this socket address structure.
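The two functions can be combined into a one-shot UDP echo step, sketched below. The name udp_echo_once and the MAXDGRAM buffer size are assumptions made for this illustration. The socket is expected to be a SOCK_DGRAM socket that has already been bound.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

#define MAXDGRAM 1024   /* assumed maximum datagram size for this sketch */

/* udp_echo_once is a hypothetical helper: wait for one datagram on sockfd
 * and send the same bytes back to its sender.
 * Returns the number of bytes echoed, or -1 on error. */
ssize_t udp_echo_once(int sockfd)
{
    char buf[MAXDGRAM];
    struct sockaddr_in from;
    socklen_t fromlen = sizeof(from);   /* in: buffer size, out: bytes stored */
    ssize_t n;

    /* Blocks until a datagram arrives; from receives the sender's address. */
    n = recvfrom(sockfd, buf, sizeof(buf), 0,
                 (struct sockaddr *) &from, &fromlen);
    if (n < 0)
        return -1;

    /* No connection exists, so the destination is given on every send. */
    return sendto(sockfd, buf, (size_t) n, 0,
                  (struct sockaddr *) &from, fromlen);
}
```

A UDP echo server would call udp_echo_once() in a loop. Unlike the TCP case, there is no listen() or accept() step.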
Concurrent Servers
There are two main classes of servers, iterative and concurrent. An iterative server iterates through each
client, handling it one at a time. A concurrent server handles multiple clients at the same time. The simplest
technique for a concurrent server is to call the fork function, creating one child process for each client. An
alternative technique is to use threads instead (i.e., light-weight processes). We do not consider this kind of
server in this course.
The fork() function is the only way in Unix to create a new process. It is defined as follows:
#include <unistd.h>
pid_t fork(void);
The function returns 0 in the child and the process ID of the child in the parent, or -1 on error.
In fact, the function fork() is called once but returns twice. It returns once in the calling process (called the
parent) with the process ID of the newly created process (its child). It also returns in the child, with a return
value of 0. The return value tells whether the current process is the parent or the child.
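The "called once, returns twice" behaviour can be seen in a small sketch. The helper run_in_child is a hypothetical name. It also uses waitpid(), which is not discussed in these notes, so that the parent can collect the child's exit status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* run_in_child is a hypothetical helper: fork a child that exits with the
 * given status, wait for it in the parent, and return the status collected.
 * Returns -1 if fork() or waitpid() fails. */
int run_in_child(int status)
{
    pid_t pid = fork();
    int wstatus;

    if (pid < 0)
        return -1;            /* fork failed: no child was created */
    if (pid == 0)
        _exit(status);        /* child: fork() returned 0 */

    /* parent: fork() returned the child's process ID */
    if (waitpid(pid, &wstatus, 0) < 0 || !WIFEXITED(wstatus))
        return -1;
    return WEXITSTATUS(wstatus);
}
```

Both branches run the same code after fork(); only the return value tells them apart.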
Example
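pid_t pid;
int listenfd, connfd;
listenfd = socket(...);
bind(listenfd, ...);
listen(listenfd, ...);
for ( ; ; ) {
    connfd = accept(listenfd, ...);   /* blocks until a client connects */
    if ( (pid = fork()) == 0 ) {
        close(listenfd);              /* child closes listening socket */
        /* child services the client on connfd */
        close(connfd);
        exit(0);                      /* child terminates */
    }
    close(connfd);                    /* parent closes connected socket */
}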
When a connection is established, accept returns, the server calls fork, and the child process services the
client (on the connected socket connfd). The parent process waits for another connection (on the listening
socket listenfd). The parent closes the connected socket since the child handles the new client. The
interactions among client and server are presented in Figure 5.
#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>
int
main(int argc, char **argv)
{
    int sockfd;
    struct sockaddr_in servaddr;
    char sendline[MAXLINE], recvline[MAXLINE];

    exit(0);
}
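The body of the client is not shown in the listing above. As a sketch of what one round of an echo client might look like, assuming the socket is already connected, the hypothetical helper echo_line below sends one line and reads the reply. It is an illustration only and is not taken from the original program.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* echo_line is a hypothetical helper: send one line on a connected socket
 * and read the server's reply into reply[] (NUL-terminated).
 * Returns the number of reply bytes, 0 if the server closed the connection,
 * or -1 on error. A single recv() is used, so a long reply may arrive only
 * partially. */
ssize_t echo_line(int sockfd, const char *line, char *reply, size_t replylen)
{
    ssize_t n;

    if (replylen == 0)
        return -1;
    if (send(sockfd, line, strlen(line), 0) < 0)
        return -1;
    n = recv(sockfd, reply, replylen - 1, 0);   /* leave room for '\0' */
    if (n < 0)
        return -1;
    reply[n] = '\0';
    return n;
}
```

A client could call echo_line() once for each line read from standard input with fgets() and print reply with fputs().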
#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>
    for ( ; ; ) {
        clilen = sizeof(cliaddr);
        connfd = accept (listenfd, (struct sockaddr *) &cliaddr, &clilen);
        printf("%s\n","Received request...");
        if (n < 0) {
            perror("Read error");
            exit(1);
        }
        close(connfd);
    }
    //close listening socket
    close (listenfd);
}
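The listing above checks n after a step that is not shown. One plausible shape for that step in an echo server is a loop that reads from the connected socket and writes the same bytes back. The sketch below is an assumption, not the original code: the name echo_until_close and the MAXLINE value are chosen for illustration, and it relies on a blocking send() normally transmitting the whole buffer.

```c
#include <sys/socket.h>
#include <sys/types.h>

#define MAXLINE 4096   /* assumed buffer size; the original value is not shown */

/* echo_until_close is a hypothetical helper: echo everything received on
 * connfd back to the client until the client closes the connection.
 * Returns 0 on an orderly close, -1 on a read or write error. */
int echo_until_close(int connfd)
{
    char buf[MAXLINE];
    ssize_t n;

    while ((n = recv(connfd, buf, sizeof(buf), 0)) > 0) {
        if (send(connfd, buf, (size_t) n, 0) < 0)
            return -1;               /* write error */
    }
    return (n < 0) ? -1 : 0;         /* n == 0 means the peer closed */
}
```

An iterative server would call echo_until_close(connfd) between accept() and close(connfd), which means the next client waits until the current one disconnects.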
#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>
    //listen to the socket by creating a connection queue, then wait for clients
    listen (listenfd, LISTENQ);
    for ( ; ; ) {
        clilen = sizeof(cliaddr);
        //accept a connection
        connfd = accept (listenfd, (struct sockaddr *) &cliaddr, &clilen);
        printf("%s\n","Received request...");
            if (n < 0)
                printf("%s\n", "Read error");
            exit(0);
        }
        //close socket of the server
        close(connfd);
    }
}