Hyper Text Transfer Protocol


HTTP

Hyper Text Transfer Protocol


HTTP is a protocol which allows the fetching of resources, such as HTML documents. It is
the foundation of any data exchange on the Web and a client-server protocol, which means
requests are initiated by the recipient, usually the Web browser. A complete document is
reconstructed from the different sub-documents fetched, for instance text, layout description,
images, videos, scripts, and more.

Clients and servers communicate by exchanging individual messages (as opposed to a stream
of data). The messages sent by the client, usually a Web browser, are called requests and the
messages sent by the server as an answer are called responses.
Designed in the early 1990s, HTTP is an extensible protocol which has evolved over time. It
is an application layer protocol that is sent over TCP, or over a TLS-encrypted TCP
connection, though any reliable transport protocol could theoretically be used. Due to its
extensibility, it is used to not only fetch hypertext documents, but also images and videos or
to post content to servers, like with HTML form results. HTTP can also be used to fetch parts
of documents to update Web pages on demand.

Components of HTTP-based systems


HTTP is a client-server protocol: requests are sent by one entity, the user-agent (or a proxy
on behalf of it). Most of the time the user-agent is a Web browser, but it can be anything, for
example a robot that crawls the Web to populate and maintain a search engine index.

Each individual request is sent to a server, which will handle it and provide an answer, called
the response. Between this request and response there are numerous entities, collectively
designated as proxies, which perform different operations and act as gateways or caches, for
example.
In reality, there are more computers between a browser and the server handling the request,
such as routers and modems.

Client: the user-agent


The user-agent is any tool that acts on behalf of the user. This role is primarily performed
by the Web browser; the few exceptions are programs used by engineers and Web
developers to debug their applications.
The Web server
On the opposite side of the communication channel is the server, which serves the document
as requested by the client. A server appears as only a single machine virtually: it may
actually be a collection of servers sharing the load (load balancing), or a complex
piece of software interrogating other computers (such as a cache, a DB server, or e-commerce
servers), totally or partially generating the document on demand.

Proxies
Between the Web browser and the server, numerous computers and machines relay the HTTP
messages. Due to the layered structure of the Web stack, most of these operate at the
transport, network or physical level; they are transparent at the HTTP layer, yet can have
a significant impact on performance. Those operating at the application layer are
generally called proxies. These can be transparent, forwarding the requests they receive
unchanged, or non-transparent, changing a request in some way before passing it along to
the server. Proxies may perform numerous functions:
• caching (the cache can be public or private, like the browser cache)
• filtering (like an antivirus scan, parental controls, …)
• load balancing (to allow multiple servers to serve the different requests)
• authentication (to control access to different resources)
• logging (allowing the storage of historical information)

Basic aspects of HTTP


1. HTTP is simple
Even with the added complexity introduced in HTTP/2 by encapsulating HTTP messages into
frames, HTTP is generally designed to be simple and human-readable. HTTP messages can
be read and understood by humans, which makes testing easier for developers and reduces
complexity for newcomers.

2. HTTP is extensible
Introduced in HTTP/1.0, HTTP headers made this protocol easy to extend and experiment
with. New functionality can even be introduced by a simple agreement between a client and a
server about a new header's semantics.
3. HTTP is stateless, but not sessionless
HTTP is stateless: there is no link between two requests being successively carried out on the
same connection. This can be problematic for users who need to interact with certain pages
coherently, for example when using e-commerce shopping baskets. But while the core of HTTP
itself is stateless, HTTP cookies allow the use of stateful sessions. Using header
extensibility, HTTP cookies are added to the workflow, allowing a session to be created so
that each HTTP request shares the same context, or the same state.
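
As a minimal sketch of how a cookie can carry session state across otherwise independent
requests, the following Python uses the standard library's http.cookies module. The cookie
name session_id, the SESSIONS store and the handle_request function are assumptions made
for illustration, not part of HTTP itself.

```python
from http.cookies import SimpleCookie

# Hypothetical in-memory session store kept by the server.
SESSIONS = {}

def handle_request(cookie_header, item):
    """Add an item to the basket of the session named by the Cookie header.

    Returns (value for a Set-Cookie header or None, current basket).
    """
    cookie = SimpleCookie(cookie_header or "")
    set_cookie = None
    if "session_id" in cookie and cookie["session_id"].value in SESSIONS:
        session_id = cookie["session_id"].value
    else:
        # No known session: create one and ask the client to remember it.
        session_id = "s%d" % (len(SESSIONS) + 1)
        SESSIONS[session_id] = []
        out = SimpleCookie()
        out["session_id"] = session_id
        set_cookie = out["session_id"].OutputString()
    SESSIONS[session_id].append(item)
    return set_cookie, list(SESSIONS[session_id])

# Two independent requests; only the cookie links them.
first_cookie, basket1 = handle_request(None, "book")
second_cookie, basket2 = handle_request(first_cookie, "pen")
```

The second call sends back the cookie value it received from the first, so the server can
find the same basket even though the protocol itself kept no state between the two requests.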

HTTP and connections


A connection is controlled at the transport layer, and is therefore fundamentally out of scope
for HTTP. HTTP doesn't require the underlying transport protocol to be connection-based; it
only requires it to be reliable, that is, not to lose messages (or at minimum to report an
error when it does). Of the two most common transport protocols on the Internet, TCP is
reliable and UDP isn't. HTTP therefore relies on TCP, which is connection-based, even
though a connection is not strictly required.

HTTP Messages
HTTP/1.1 and earlier HTTP messages are human-readable. In HTTP/2, these messages are
embedded into a new binary structure, a frame, allowing optimizations like compression of
headers and multiplexing. Even if only part of the original HTTP message is sent in this
version of HTTP, the semantics of each message is unchanged and the client reconstitutes
(virtually) the original HTTP/1.1 request. It is therefore useful to comprehend HTTP/2
messages in the HTTP/1.1 format.

There are two types of HTTP messages, requests and responses, each with its own format.

Requests
An example HTTP request:

Requests consist of the following elements:

• An HTTP method, usually a verb like GET or POST, or a noun like OPTIONS or HEAD, that
defines the operation the client wants to perform. Typically, a client wants to fetch a resource
(using GET) or post the value of an HTML form (using POST), though more operations may
be needed in other cases.
• The path of the resource to fetch: the URL of the resource stripped of elements that are
obvious from the context, for example the protocol (http://),
the domain (here developer.mozilla.org), or the TCP port (here 80).
• The version of the HTTP protocol.
• Optional headers that convey additional information for the server.
• Optionally, a body, for some methods like POST, similar to those in responses, which
contains the resource sent.
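
To make the elements above concrete, here is a small Python sketch that splits a raw
HTTP/1.1 request into its parts. The sample request text and the parse_request helper are
illustrative assumptions; a production parser would need to handle many more cases.

```python
def parse_request(raw):
    """Split a raw HTTP/1.1 request into method, path, version, headers and body."""
    # A blank line (CRLF CRLF) separates the head from the optional body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # The first line carries the method, the path and the protocol version.
    method, path, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"method": method, "path": path, "version": version,
            "headers": headers, "body": body}

# Illustrative request; the header values are assumptions.
raw_request = (
    "GET / HTTP/1.1\r\n"
    "Host: developer.mozilla.org\r\n"
    "Accept-Language: fr\r\n"
    "\r\n"
)
request = parse_request(raw_request)
```

Note that the domain travels in the Host header rather than in the path, which is why the
path on the first line can stay short.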
Responses
An example response:

Responses consist of the following elements:

• The version of the HTTP protocol they follow.
• A status code, indicating whether the request has been successful or not, and why.
• A status message, a non-authoritative short description of the status code.
• HTTP headers, like those for requests.
• Optionally, a body containing the fetched resource.
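
The same idea applies to responses. The following Python sketch pulls the version, status
code, status message, headers and body out of a raw response. The sample response and the
parse_response helper are assumptions for illustration only.

```python
def parse_response(raw):
    """Split a raw HTTP/1.1 response into its elements."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Status line: version, numeric code, then a free-text message.
    version, code, message = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"version": version, "status": int(code), "message": message,
            "headers": headers, "body": body}

# Illustrative response; the header values and body are assumptions.
raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 12\r\n"
    "\r\n"
    "<p>Hello</p>"
)
response = parse_response(raw_response)
```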
How HTTP Works
HTTP is an application layer protocol built on top of TCP that uses a client-server
communication model.
HTTP clients and servers communicate via HTTP request and response messages. The three
most common HTTP request methods are GET, POST, and HEAD.
• An HTTP GET request sent to a server identifies the resource by its URL and normally
carries no body. Zero or more optional data parameters may be appended to the end of the
URL. The server processes the optional data portion of the URL, if present, and returns
the result (a web page or element of a web page) to the browser.
• An HTTP POST request places any optional data parameters in the body of the request
message rather than adding them to the end of the URL.
• An HTTP HEAD request works the same as a GET request, except that instead of replying
with the full contents of the resource, the server sends back only the status line and
the HTTP response headers, with no body.
The browser initiates communication with an HTTP server by opening a TCP connection to
the server. Web browsing sessions use server port 80 by default (443 for HTTPS), although
other ports such as 8080 are sometimes used instead.
Once a session is established, the user triggers the sending and receiving of HTTP messages
by visiting the web page.
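
The difference between where GET and POST place their parameters can be sketched with
Python's standard urllib. Nothing is sent over the network here; the Request objects are
only constructed and inspected. The host example.com and the parameter names are
placeholders chosen for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical form parameters.
params = {"q": "http", "page": "2"}
query = urlencode(params)

# GET: parameters are appended to the URL, and there is no body.
get_req = Request("http://example.com/search?" + query)

# POST: the same parameters travel in the request body instead.
post_req = Request("http://example.com/search", data=query.encode("ascii"))

# HEAD: like GET, but the server would reply with headers only.
head_req = Request("http://example.com/search?" + query, method="HEAD")
```

Passing such an object to urllib.request.urlopen would open the TCP connection and send
the request; that step is left out so the sketch runs without network access.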
SMTP
Simple Mail Transfer Protocol
SMTP stands for Simple Mail Transfer Protocol. It is a TCP/IP protocol that specifies how
computers exchange electronic mail. It works with the Post Office Protocol (POP). SMTP is
used to upload mail directly from the client to an intermediate host, but only computers that
are constantly connected to the Internet, such as those of Internet Service Providers (ISPs),
can use SMTP to receive mail. The ISP servers then offload the mail to the users to whom
they provide Internet service.
SMTP uses TCP port number 25 for this service. E-mail is therefore delivered from source to
destination by having the source machine establish a TCP connection to port 25 of the
destination machine.
To send mail, a system must have a client MTA, and to receive mail, a system must have
a server MTA. SMTP transfers the message from the client MTA to the server MTA. SMTP uses
commands and responses to transfer the message between an MTA client and an MTA server. In
order to send a mail, SMTP is used two times: once between the sender and the sender's mail
server, and once between the two mail servers.
Each command or response ends with two characters, CR and LF. CR stands for Carriage
Return and LF stands for Line Feed.
Windows NT Option Pack 4 includes an SMTP mail client, as does the Windows NT Resource
Kit. Microsoft Exchange Server will route your LAN mail on and off the Internet.
Working of SMTP: SMTP is a simple ASCII protocol based on the client-server model.
After establishing the TCP connection, the sending machine, operating as the client, waits for
the receiving machine, operating as the server, to talk first. The server starts by sending a line
of text giving its identity and telling whether or not it is prepared to receive mail. If it is not,
the client releases the connection and tries again later.
If the server is willing to accept e-mail, the client announces whom the e-mail is coming from
and whom it is going to. The server then gives the client the go-ahead to send the message.
Then the client sends the message and the server acknowledges it.
The problems that may arise with the SMTP protocol are as follows:
Some older SMTP implementations cannot handle messages exceeding 64 KB.
If the client and server have different timeouts, one of them may give up while the other is
still busy, unexpectedly terminating the connection.
To get around these problems, extended SMTP (ESMTP) has been defined in RFC 1425.

SMTP Commands
• SMTP commands are sent from the client to the server.
• Each command consists of a keyword or command name followed by zero or more
arguments. That is, some keywords take no argument.
• The format of command is:
Keyword: argument(s)
• There are 14 different SMTP commands listed in the table below:
SMTP Responses
• SMTP responses are sent from server to client.
• Each response begins with a three digit code and may be followed by additional
textual information.
• The leading digit indicates the category of the response.
The different categories of response are:
1. Positive completion reply (codes 2yz). It indicates that the requested action has been
successfully completed. A new request may be initiated.
2. Positive intermediate reply (codes 3yz). It indicates that the command has been accepted,
but the requested action is being held in abeyance, pending receipt of further information.
3. Transient negative completion reply (codes 4yz). It indicates that the command was not
accepted and the requested action did not occur. However, the error condition is temporary
and the action may be requested again.
4. Permanent negative completion reply (codes 5yz). It indicates that the command was not
accepted and the requested action did not occur.
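
As a small illustration, the Python sketch below maps a reply line to one of the four
categories above. It assumes the standard SMTP convention that the first digit of the
three-digit code (2, 3, 4 or 5) selects the category; the classify_reply helper and the
sample reply texts are illustrative.

```python
# First digit of the reply code -> category (standard SMTP convention).
CATEGORIES = {
    "2": "positive completion",
    "3": "positive intermediate",
    "4": "transient negative completion",
    "5": "permanent negative completion",
}

def classify_reply(line):
    """Return (code, category, text) for a single-line SMTP reply."""
    code = line[:3]
    if len(code) != 3 or not code.isdigit():
        raise ValueError("reply must start with a three-digit code")
    text = line[4:].strip()
    return int(code), CATEGORIES.get(code[0], "unknown"), text

ok = classify_reply("250 OK")
busy = classify_reply("421 Service Not Available")
```

A client would typically retry later after a transient negative reply, and give up or
report an error to the user after a permanent one.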

The various SMTP responses are listed in the table below:

Mail Transfer Phases


The basic SMTP operation occurs in three phases:

1. Connection set up
2. Mail transfer
3. Connection termination
Connection Setup
An SMTP sender will attempt to set up a TCP connection with a target host when it has one
or more mail messages to deliver to that host. The following sequence occurs during
connection setup:

1. The sender opens a TCP connection with the receiver.
2. Once the connection is established, the receiver identifies itself with "220 Service Ready".
3. The sender identifies itself with the HELO command.
4. The receiver accepts the sender's identification with "250 OK".
5. If the mail service on the destination is not available, the destination host returns a "421
Service Not Available" reply in step 2 and the process is terminated.
Mail transfer
• Once the connection has been established, the SMTP sender may send one or more
messages to the SMTP receiver.
• There are three logical phases to the transfer of a message :
1. A MAIL command identifies the originator of the message.
2. One or more RCPT commands identify the recipients of this message.
3. A DATA command transfers the message text.
Connection termination
• The SMTP sender closes the connection in the following manner:
1. The sender sends a QUIT command and waits for a reply.
2. Sender initiates TCP close operation for the TCP connection.
3. The receiver initiates its TCP close after sending its reply to the QUIT command.
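
The three phases can be sketched as the sequence of lines a client would send, each
terminated by CR LF. This Python sketch only builds the lines and does not open a
connection; the build_session helper, the host name and the addresses are assumptions for
illustration, and details such as escaping message lines that begin with a period are
deliberately left out.

```python
CRLF = "\r\n"

def build_session(client_host, sender, recipients, message_lines):
    """Return the lines an SMTP client sends for one message, in order."""
    lines = ["HELO " + client_host]                       # connection setup
    lines.append("MAIL FROM:<%s>" % sender)               # originator
    lines += ["RCPT TO:<%s>" % r for r in recipients]     # one per recipient
    lines.append("DATA")                                  # start of message text
    lines += list(message_lines)
    lines.append(".")                                     # lone period ends the text
    lines.append("QUIT")                                  # connection termination
    return [line + CRLF for line in lines]

session = build_session(
    "client.example",
    "alice@example.com",
    ["bob@example.org", "carol@example.org"],
    ["Subject: Hi", "", "Hello."],
)
```

In a real exchange the client waits for the server's reply after each command before
sending the next line.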

SMTP Commands
The SMTP standard defines a set of commands: names of specific types of messages that
mail clients send to the mail server when requesting information. The most commonly used
commands are:
• HELO and EHLO - commands that initiate a new protocol session between client
and server. The EHLO command asks the server to respond with any optional SMTP
extensions it supports
• MAIL - command to initiate sending an email message
• RCPT - command to provide one email address for a recipient of the current message
being prepared
• DATA - command indicating the start of transmission of the email message. This
command initiates a series of one or more follow-on lines, each containing a piece
of the message. The last line in the sequence contains only a period
(.) as a termination character to signify the end of the email.
• RSET - while in the process of sending an email (after issuing the MAIL command),
either end of the SMTP connection can reset the connection if it encounters an error
• NOOP - an empty ("no operation") message designed as a kind of ping to check for
responsiveness of the other end of the session
• QUIT - terminates the protocol session
The recipient of these commands replies with either success or failure code numbers.
IMAP
Internet Message Access Protocol
Definition
IMAP is an internet standard that describes a protocol for retrieving mail from an email
(IMAP) server.
What Can IMAP Do?
Typically, messages are stored and organized in folders on the server. Email clients on
computers and mobile devices replicate that structure, at least in part, and synchronize actions
(such as deletion or moving messages) with the server.
That means multiple programs can access the same account and all show the same state and
messages, all synchronized.
It allows you to move messages between email accounts seamlessly and to have third-party
services connect to your account to add functionality (for example, to automatically sort or
back up messages).
IMAP is an acronym for Internet Message Access Protocol, and the protocol's current
version is IMAP 4 (IMAP4rev1).

• IMAP allows the client program to manipulate e-mail messages on the server
without downloading them to the local computer.
• The e-mail is held and maintained by the remote server.
• It enables us to take any action, such as downloading or deleting a mail, without reading
the mail. It enables us to create, manipulate and delete remote message folders called
mailboxes.
• IMAP enables users to search their e-mails.
• It allows concurrent access to multiple mailboxes on multiple mail servers.
IMAP Commands
The following table describes some of the IMAP commands:

S.N.  Command      Description

1     IMAP_LOGIN   This command opens the connection.
2     CAPABILITY   This command requests a listing of the capabilities that the server supports.
3     NOOP         This command is used as a periodic poll for new messages or message status
                   updates during a period of inactivity.
4     SELECT       This command selects a mailbox so that its messages can be accessed.
5     EXAMINE      It is the same as the SELECT command, except that no change to the mailbox
                   is permitted.
6     CREATE       It is used to create a mailbox with a specified name.
7     DELETE       It is used to permanently delete a mailbox with a given name.
8     RENAME       It is used to change the name of a mailbox.
9     LOGOUT       This command informs the server that the client is done with the session.
                   The server must send a BYE untagged response before the OK response and
                   then close the network connection.


Simple Mail Transfer Protocol (SMTP) is what your email client (e.g. Gmail,
Thunderbird, or Outlook) uses to send your email messages to your email server. The email
server is often hosted by your email service provider, for instance Google, but it can also be
hosted by your Internet service provider (most often by the same one that hosts your domain).
Next, SMTP is also used by your email server to send your message to your recipient's
email server. From there, the recipient's email client can fetch your message
using Internet Message Access Protocol (IMAP) and put it in their inbox, where they can
access it.
RPC
Remote Procedure Call (RPC)

Remote Procedure Call (RPC) is a powerful technique for constructing distributed, client-
server based applications. It is based on extending the conventional local procedure calling,
so that the called procedure need not exist in the same address space as the calling
procedure. The two processes may be on the same system, or they may be on different
systems with a network connecting them.

When making a Remote Procedure Call:

1. The calling environment is suspended, procedure parameters are transferred across the
network to the environment where the procedure is to execute, and the procedure is executed
there.
2. When the procedure finishes and produces its results, its results are transferred back to the
calling environment, where execution resumes as if returning from a regular procedure call.
Working of RPC

The following steps take place during an RPC:

1. A client invokes a client stub procedure, passing parameters in the usual way. The client
stub resides within the client's own address space.
2. The client stub marshals (packs) the parameters into a message. Marshaling includes
converting the representation of the parameters into a standard format, and copying each
parameter into the message.
3. The client stub passes the message to the transport layer, which sends it to the remote
server machine.
4. On the server, the transport layer passes the message to a server stub,
which unmarshals (unpacks) the parameters and calls the desired server routine using the
regular procedure call mechanism.
5. When the server procedure completes, it returns to the server stub (e.g., via a normal
procedure call return), which marshals the return values into a message. The server stub
then hands the message to the transport layer.
6. The transport layer sends the result message back to the client transport layer, which hands
the message back to the client stub.
7. The client stub unmarshals the return parameters and execution returns to the caller.
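
The seven steps above can be sketched in a few lines of Python. This is a toy model under
stated assumptions: JSON stands in for the standard wire format, a plain function call
stands in for the transport layer, and the names add, client_stub_add, server_stub and
transport are all hypothetical.

```python
import json

def add(a, b):
    """The server routine that actually does the work."""
    return a + b

# Procedures the server is willing to run, keyed by name.
SERVER_PROCEDURES = {"add": add}

def server_stub(message):
    """Unmarshal the request, call the routine, marshal the result (steps 4-5)."""
    request = json.loads(message.decode("utf-8"))
    result = SERVER_PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")

def transport(message):
    """Stand-in for the network: delivers bytes and returns the reply (steps 3, 6)."""
    return server_stub(message)

def client_stub_add(a, b):
    """Looks like a local call to the caller (steps 1-2 and 7)."""
    message = json.dumps({"proc": "add", "args": [a, b]}).encode("utf-8")
    reply = transport(message)
    return json.loads(reply.decode("utf-8"))["result"]

total = client_stub_add(2, 3)
```

The caller only ever sees client_stub_add; the packing, sending and unpacking stay hidden
inside the stubs.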
RPC ISSUES
Issues that must be addressed:
1. RPC Runtime: The RPC run-time system is a library of routines and a set of services that
handle the network communications that underlie the RPC mechanism. In the course of an
RPC call, client-side and server-side run-time systems’ code handle binding, establish
communications over an appropriate protocol, pass call data between the client and
server, and handle communications errors.
2. Stub: The function of the stub is to provide transparency to the programmer-written
application code.
On the client side, the stub handles the interface between the client’s local procedure call
and the run-time system, marshaling and unmarshaling data, invoking the RPC run-time
protocol, and if requested, carrying out some of the binding steps.
On the server side, the stub provides a similar interface between the run-time system and the
local manager procedures that are executed by the server.
3. Binding: How does the client know who to call, and where the service resides?
The most flexible solution is to use dynamic binding and find the server at run time when the
RPC is first made. The first time the client stub is invoked, it contacts a name server to
determine the transport address at which the server resides.
Binding consists of two parts:
• Naming:
Remote procedures are named through interfaces. An interface uniquely identifies a
particular service, describing the types and numbers of its arguments. It is similar in
purpose to a type definition in programming languages.
• Locating:
Finding the transport address at which the server actually resides. Once we have the transport
address of the service, we can send messages directly to the server.
A Server having a service to offer exports an interface for it. Exporting an interface registers
it with the system so that clients can use it.
A Client must import an (exported) interface before communication can begin.
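
Dynamic binding through a name server can be sketched as follows. This Python model keeps
everything in memory; the NameServer and ClientStub classes, the interface name
calculator-v1 and the address are hypothetical and chosen only to illustrate exporting,
importing and locating.

```python
class NameServer:
    """Maps interface names to the transport address of the exporting server."""

    def __init__(self):
        self._registry = {}

    def export(self, interface, address):
        # Called by a server that offers the interface.
        self._registry[interface] = address

    def lookup(self, interface):
        # Called by a client that imports the interface.
        return self._registry[interface]


class ClientStub:
    """Binds lazily: contacts the name server only on first use."""

    def __init__(self, name_server, interface):
        self._name_server = name_server
        self._interface = interface
        self._address = None
        self.lookups = 0  # counts how often the name server was contacted

    def bind(self):
        if self._address is None:
            self._address = self._name_server.lookup(self._interface)
            self.lookups += 1
        return self._address


name_server = NameServer()
name_server.export("calculator-v1", ("10.0.0.5", 5000))

stub = ClientStub(name_server, "calculator-v1")
first = stub.bind()
second = stub.bind()
```

Caching the address after the first lookup means later calls can go directly to the
server without consulting the name server again.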
ADVANTAGES
1. RPC provides ABSTRACTION, i.e., the message-passing nature of network communication is
hidden from the user.
2. RPC often omits many of the protocol layers to improve performance. Even a small
performance improvement is important because a program may invoke RPCs often.
3. RPC enables applications to be used in a distributed environment, not only in the
local environment.
4. With RPC, the effort of re-writing or re-developing code is minimized.
5. RPC supports both process-oriented and thread-oriented models.
TELNET.
TELNET is an abbreviation for TErminaL NETwork. It is the standard TCP/IP protocol for
virtual terminal service as proposed by the International Organization for Standardization (ISO).
TELNET enables the establishment of a connection to a remote system in such a way that the
local terminal appears to be a terminal at the remote system. TELNET is a general-purpose
client/server application program. TELNET was designed at a time when most operating
systems, such as UNIX, were operating in a timesharing environment. In such an
environment, a large computer supports multiple users. The interaction between a user and
the computer occurs through a terminal, which is usually a combination of keyboard,
monitor, and mouse. Even a microcomputer can simulate a terminal with a terminal emulator.
In a timesharing environment, users are part of the system with some right to access
resources. Each authorized user has an identification and probably, a password. The user
identification defines the user as part of the system. To access the system, the user logs into
the system with a user id or log-in name. The system also includes password checking to
prevent an unauthorized user from accessing the resources. The figure shows the log-in process.

When a user logs into a local timesharing system, it is called local log-in. As a user types at a
terminal or at a workstation running a terminal emulator, the keystrokes are accepted by the
terminal driver. The terminal driver passes the characters to the operating system. The
operating system, in turn, interprets the combination of characters and invokes the desired
application program or utility. When a user wants to access an application program or utility
located on a remote machine, she performs remote log-in. Here the TELNET client and
server programs come into use. The user sends the keystrokes to the terminal driver, where
the local operating system accepts the characters but does not interpret them. The characters
are sent to the TELNET client, which transforms the characters to a universal character set
called network virtual terminal (NVT) characters and delivers them to the local TCP/IP
protocol stack. The commands or text, in NVT form, travel through the Internet and arrive at
the TCP/IP stack at the remote machine. Here the characters are delivered to the operating
system and passed to the TELNET server, which changes the characters to the corresponding
characters understandable by the remote computer. However, the characters cannot be passed
directly to the operating system because the remote operating system is not designed to
receive characters from a TELNET server: It is designed to receive characters from a terminal
driver. The solution is to add a piece of software called a pseudoterminal driver which
pretends that the characters are coming from a terminal. The operating system then passes the
characters to the appropriate application program.
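
The translation to and from NVT form can be sketched in Python. The sketch assumes two
properties of NVT that the text above does not spell out: characters are 7-bit ASCII, and
the end of a line is sent as CR LF. It ignores option negotiation and control characters,
and the helper names to_nvt and from_nvt are hypothetical.

```python
def to_nvt(text):
    """Client side: convert local text to NVT bytes (ASCII, CR LF line ends)."""
    return text.replace("\n", "\r\n").encode("ascii")

def from_nvt(data, local_newline="\n"):
    """Server side: convert NVT bytes to the remote system's own line ending."""
    return data.decode("ascii").replace("\r\n", local_newline)

typed = "ls -l\n"
on_the_wire = to_nvt(typed)
at_remote = from_nvt(on_the_wire)
```

Because both ends agree on the NVT form, neither side needs to know which character
conventions the other system uses locally.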
