Nimrod
A New Routing and Addressing Architecture for the Internet
J. Noel Chiappa
"In anything at all, perfection is finally attained
not when there is no longer anything to add, but
when there is no longer anything to take away..."
-- Antoine de Saint Exupery, "Wind, Sand and Stars"
(quoted in "Multics: The First Seven Years",
F. J. Corbato, J. H. Saltzer)
"One ring to rule them all, one ring to find them, one ring to bring them
all, and in the darkness bind them."
-- J. R. R. Tolkien, "Lord of the Rings"
Introduction
What is Nimrod about?
Nimrod is a routing architecture, not a routing protocol;
done by a system architect, not an algorithmic expert.
In other words, don't expect to see:
- New path-finding (routing) algorithms
- New abstraction algorithms
Expect rather to see:
- Ideas on what the overall functionality of routing ought
to be
- Ideas on how routing ought to interact with the rest
of the internetwork layer
- Ideas on how to organize the internals of the routing functionality
Internet Evolutionary Needs
The Internet was not designed to grow to its current size or
importance. The Internet desperately needs several things:
- A new addressing architecture
- Recognition and explicit naming of some new fundamental abstractions
- A much improved routing architecture
- A resource allocation architecture
- A security architecture
- Accounting
Routing and Addressing Architectures
What are "routing and addressing architectures"?
An addressing architecture includes:
- A system of naming networks, interfaces, topology aggregates, etc
A routing architecture is not simply a protocol (i.e. a
set of packet formats, and instructions on how to process them). Rather, it
considers the entire subject of how the network organizes the handling of
user traffic.
A routing architecture includes:
- A way of exchanging information as to where these named things are
- A way of computing and selecting paths between sets of communicating
entities
- A way of causing traffic to take these paths
Examples and Non-Examples of Routing Architectures
Routing protocols:
- BGP is a routing protocol
- iBGP is a routing protocol
- OSPF is a routing protocol
- RIP is a routing protocol
Routing architectures:
- Nimrod is a routing architecture
- The entire collection of BGP, etc, along with the specifications for
how they interact, is a routing architecture (albeit an incredibly poorly
organized and lashed-together one, not to mention one without many desirable
capabilities)
Architectural Discussion Fundamentals
Confusion in Terminology and Concepts
I am far from thinking that nomenclature is a remedy for every defect in art
or science: still I cannot but feel that confusion of terms generally springs
from, and always leads to, confusion of ideas.
-- John Louis Petit, Architectural Studies in France, 1854
Much of the confusion in discussion of routing results from varying meanings
for terms, particularly address. This happened, in part, because the
set of fundamental objects recognized was not rich enough.
Thus, difficulties have arisen when solutions to various problems demanded
that objects which were previously hidden "inside" or "behind" other objects
be recognized as separate entities.
Names and Objects
We must also separate out the name for a thing from the
object, the thing itself, to which those names refer.
We must also separate out the form of the name for a thing from
the generic concept of a name for the thing.
One of more possible names or types of names can exist for each object,
and different names may or may not have structure. The structure is there
to help something else do its job.
Objects
- Interface:
- A place to which a producer or consumer of packets connects to
the network; a network attachment point
- The place towards which the system of routers directs packets
- Network:
- A collection of interfaces which have some useful relationship:
- Any interface can send directly to any other without going
through a router
- A topology aggregate
- Route or Path:
- A path from one place in the network to another
- Host:
- An actual machine which is the source or destination of traffic,
through some interface
- Router:
- A device which is interconnecting various elements of the network,
and forwarding traffic
- Node:
- A host or router
- In Nimrod technical terminology, an element of a Nimrod map (which
is represented as a graph, with nodes and arcs)
Names
- Address:
- A structured name for a network interface or topology aggregate:
- The structure is used by the routing to help it scale
- Topologically related items have to be given related addresses
Topologically related addresses also:
- Allow the number of destinations tracked by the routing
to be minimized
- Allow quick location of the named interface on a map
- Provide a representation for topology distribution process
- Provide a framework for the abstraction process
- DNS names:
- A structured human usable name for a host, etc
- The structure is used to facilitate the distribution and lookup
New Terminology and Concepts
A new routing architecture should add a new object, and a new naming
space for them, to the packet layer: the endpoint.
An endpoint is:
- Approximately, the entity at one end of a TCP connection
- A fate-sharing region
- An end-end entity
- Can be hosts, mobile processes, etc...
Routing Fundamentals
Classes of Routing Architectures
In one of the two fundamental divisions of routing architectures, paths
can be calculated either:
- By each node along the route in turn, called hop-by-hop;
i.e. each node makes a wholly independent decision as to which
next hop a given packet will take.
- In a localized way, by the source or an agent thereof), often
called source-routing, but also referred to as explicit
or unitary.
(Note: these latter terms, explicit and unitary, are to be preferred to
source routing, as this is often taken to have the limited meaning
that the host sending the packet must compute the entire route, which is then
included in every packet.)
There is also a complex spectrum of possibilities between these endpoints.
Along another axis, there are two broad classes of routing algorithms:
- Destination Vector (DV) algorithms, where nodes pass sets of
(destination, information) n-tuples to immediate neighbours, and this
information is used to construct routing tables.
They are characterized by:
- The computation of the route from a given source to a given destination
is evenly distributed; i.e. at any given level of abstraction,
no node has a more significant responsibility in selecting the path
- Intermediate results in that computation are passed between independent
nodes
- Map Distribution (MD) algorithms, where nodes flood information about
their immediate surroundings through the network; this information is used to
construct topology maps. They are characterized by:
- The computation of the route can be either localized or hop-by-hop, but
in either case it is not fully distributed
- No intermediate results are passed around
- The routing computation may easily be delayed until the path is
needed
Advantages and Disadvantages of Various Routing Architectures
Hop-by-hop architectures have the following disdvantages:
- They effectively require global consistency in
the routing databases to prevent routing loops.
Explicit architectures have the following advantages:
- They trivially allow the testing and deployment of new path-selection
algorithms
- They can allow the users much more control over the path of
their traffic
- They are more naturally immune to loops
- They can be made much more robust
- They can be secured against almost all attacks
DV architectures have the following advantages:
- They require less computing and storage overhead than MD
MD architectures have the following advantages:
- They are much more practical to secure against many attacks
- They can react more quickly to changes (in topology, etc)
- Their stabilization time is not only shorter on average, but
also much more bounded
- They require less bandwidth to react to significant changes in
network topology
- Policy routing (which includes most QOS routing) is only practical with
MD architectures
- Highly secure routing is only practical with MD architectures
The Current Internet Routing Architecture
The current Internet routing architecture has the following key
characteristics:
- Destination Vector type
- Hop-by-hop routed
Do not be misled by the deployment of IGP's which use MD architectures,
such as OSPF and IS-IS, in limited areas of the network. The
overall operation of the current Internet routing architecture is DV:
- The data passed between AS's consists of routing tables, not maps
- The path across a number of AS's is selected piecemeal, not in a
unitary fashion
Tradeoffs in Large Scale Routing
In a large network, "good routes" are not a simple, fixed goal. Rather,
there is a complex tradeoff at work:
- In any large network, you have to discard detailed routing information,
or incur the overhead cost of massive amounts of potentially unneeded detail.
- This process is called abstraction.
- This process can be divided into two classes:
- Compression, in which the same routing decision is made in all
cases after the abstraction as before
- Thinning, in which the routing is affected
- In real networks where abstraction is needed, thinning is generally
required, particularly in an environment rich in policies and
link attributes (e.g. bandwidth, delay, etc).
- If the routing was previously optimal, discarding routing information
via thinning means non-optimal routes, i.e. costs for traffic taking extra
hops, or failure to find a policy acceptable route which does exist but is
not visible, etc.
- Routing in large networks thus incurs two kinds of overhead cost:
- The cost of running the routing
- The cost of non-optimal routes
- The challenge of routing in very large networks is managing this
choice of costs.
- This choice is not a purely technical issue, but rather a cost/benfit
policy issue, so it is a trade-off we must to some degree leave open in the
design.
Addresses, Topology and Routing
There are subtle connections between the topology of a network (i.e. how it
is connected together), the abstraction hierarchy (i.e. hierarchical
addressing structure) used to create addresses for it, and the routes
that are generated:
- An abstraction hierarchy is loosely isomorphic to a topology.
- The abstraction hierarchy is chosen to minimize the sum of the two
costs of routing overhead (i.e. direct costs, and non-optimal path costs),
for a given topology.
- The abstraction hierarchy chosen does have subtle implications on
which paths are discovered, and thus, used
- As the topology changes over time, the cost (i.e. the sum of the
two costs) is no longer minimal.
- In the long run, you have to change addresses. This should be
dynamic, and semi-automatic.
Nimrod Design Goals and Principles
Technical Goals for Nimrod
Nimrod has an extensive list of goals, but this one tops the list:
- Maximize the lifetime of the architecture
This is by far and away the most important goal. To understand why, let's
look at why large-scale communication networks - and the design philosophy
that must be used in designing them - are inherently fundamentally different
from all other large-scale software systems.
Design Philosophy for Large-Scale Communication Networks
- The design philosophy used for these systems is inherently different
from that used in other large software systems.
- The need to communicate means that simultaneous use of alternative
solutions is not really viable, as it is in all other software systems:
e.g. operating systems, editors, languages, etc.
- This need to communicate means that if adding new capabilities
means switching to a significantly different architecture, change will very
difficult, as upward compatability, which is limiting, must
always be observed.
- The sheer scale, and difficulty, of change in such systems means that
major changes are very expensive and painful.
- Thus, to minimize costs over the entire life-cycle of an
large-scale communication network architecture, maximizing the life-time
of the architecture is the key goal.
Devolution
As fate-sharing was perhaps the key concept behind the last
generation of networking architecture, the key concept of the next generation
of networking architecture will probably be devolution.
Devolution, simply put, means that we will never do in the core of the
network anything we can move to the edge. In other words, the network won't
try and make decisions for the users, but rather will hand the responsibility
over to the users.
This is related to the concept of separating policy from mechanism.
The relocation of mechanism, from the network, into the hosts, is a key
part of maximizing the lifetime of the architecture. The less the network
does, the easier it will be to change and upgrade functionality.
Architectural Lifetimes and Engineering Viability
There is an inherent tension between producing an architecture with a useful
lifetime (and how often have we seen crippling costs when an architecture
runs into its limits), and producing an architecture which is viable, in
engineering terms, at the time it is first deployed.
Overhead costs, etc, which seem trivial toward the end of the lifetime of an
architecture (typically twenty years, or more) seem infeasible at the time
the architecture is deployed.
However, without the boldness to pay high costs up front, to allow adoption
of an architecture with more room for growth over the long run, down the
line the architecture will be crippled by limitations imposed in order to
produce engineering savings with only short utility.
There is no easy solution to this problem.
Additional Technical Goals for Nimrod
Nimrod has many other goals, which are ancillary to simply maximizing
the lifetime:
- An incrementally/interoperably deployable system - without this,
it cannot be deployed.
- Support efficient datagram mode service; i.e. the ability of a
user to send packets without requiring any prior network setup and at low
overhead.
- Scale indefinitely in terms of number of networks, routers, etc. The
expected worldwide network system size is so large that optimizations
based on a fixed maximum size aren't worth the limitations.
- Provide a very robust system, one which is proof against all kinds
of failures as well as attacks.
- Provide flexible firewalls which allows the users to enforce
as much security on the routing as they need.
- Support policy routing. Policy does not mean just
restrictions on traffic; it includes all routing decisions (such as QOS) not
made on a simple metric or simple combination of a few metrics. Policies can
be divided into the following classes:
- Restrictions on which traffic can use particular transmission
resources (also called transit policies, or access control).
- Restrictions on which resources can be used by particular traffic
(also called source policies or the trust model).
- Restrictions on who can find out the details about certain parts
of the network (also called information hiding)
An alternative term to policy routing which is used a lot now is
constraint-based routing.
- Minimize the hand-configuration required of users, since this is a
source of potential errors, as well as being tedious and complex.
Design Principles for Nimrod
- Do the best possible design, and then work out how to deploy it;
don't simply imcrementally improve the existing system.
- Maximize the lifetime by:
- Making the design as flexible as possible
- Minimizing the mandatory, system wide part of the architecture
- Maximize robustness by:
- Reducing the dependencies among various components
- Use of redundant mechanisms where plausible
- Break the routing up into a number of subsystems, with the
interfaces between them visible, and available, to the users.
Nimrod Architecture Overview
Nimrod Fundamental Architecture
The Nimrod routing architecture has the following key characteristics:
- Map Distribution type
- Explicitly routed
This combination has a number of advantages:
- Allows maximal overall flexibility (i.e. system lifetime)
- Is securable against explicit attack
- Allows maximal robustness
- Interacts well with flexible abstraction mechanisms
- Allows policy routing (allows complete source control over packet
forwarding path).
- Allows maximally flexible policy routing
- Interacts well with resource allocation systems
It also includes the following key concepts:
- The fundamental mechanism for forwarding user data packets is
the flow
- Uses hierarchical abstraction to allow and manage scaling
Note: The path-selection algorithm is not part of the protocol
(although a sample algorithm can be given in an appendix).
Additional Nimrod Architectural Points
The Nimrod routing architecture is also characterized by the following
concepts:
- Provision for efficient handling of datagram service mode packets,
using a mesh of Datagram Mode Flows (DMF's) which are maintained at all
times
- Flexible abstraction techniques:
- Compression + Thinning ==> Abstraction
- Thinning is needed for practicality, but causes non-optimal
routes
- Nimrod allows users to pick their own balance between these two
costs
- Abstraction algorithms (outgoing and incoming) are not part of
the spec
- Complete routing system, down to the lowest levels, with no
simplistic IGP/EGP split
Reflections on the Architecture
- Note what is not part of the architecture:
- Path-selection algorithm
- Outgoing abstraction algorithm
- Incoming abstraction algorithm (Local Abstraction Control)
- Key advantages of this approach:
- Limits amount of work to be done
- Allows future improvement
- Allows easy deployment of new algorithms
- Reduces scope for errors and bad design by minimizing global
mechanisms
- What is part of the architecture:
- Method of representing topology
- Method of distributing topology information
- Method of setting up flows, and handling datagram mode packets
Overview of Nimrod Mechanisms
Addresses
Nimrod uses, and allow use of, new addresses (something
else the Internet needs), called locators, using a syntax and
semantics which will hopefully never need to be reworked. Their
characteristics are:
- Variable length
- Variable number of levels
- Addresses can name topology aggregates as well as individual
network elements
- Addresses are not intended to descend from a fixed root, but will
instead by built in a natural "bottom-up" style, as larger and larger
addressing abstractions are defined.
Note that it is not intended that all packets will carry these addresses;
packets which are using the flow service mode need only carry
flow identifiers.
Maps
Connectivity information in the form of maps is the key data item
in Nimrod.
Maps:
- Are the data which routers pass around
- Form the database which is used to selects paths
Nimrod maps consist of:
- Nodes (which have an open-ended list of attributes)
- Arcs (which are uni-directional links which connect nodes, and
cannot have attributes; they simply indicate connectivity)
In general, it is not required that different routers have consistent maps.
More About Maps
A node in a map represents some part of the physical network. It may be as
large or as small as desired: a node can represent an entire ISP, or a single
interface.
A contiguous portion of the graph which represents the entire physical
network may be represented by a node, or set of nodes, which are an
abstraction for that part of the network.
Abstractions can be recursive.
The entities responsible for advertising the abstraction for an area of the
physical network get to chose what abstraction for that area they wish to
hand out.
Each node which is using maps to select paths can, within limits, decide for
iself how much detail it wants to see in its map. If the information it
gets by default is not detailed enough, it can go out and ask for more.
Attributes
The nodes in the map may be characterized by an open-ended (and locally
extensible) list of attributes, which can include many different kinds
of information. Attributes might include:
- Bandwidth
- Delay
- Delay variance
- Error rate
- Cost
- Allowed users (e.g. government only)
Each node also has some inherent attributes, i.e. ones which each node
must have:
- The locator of the node
- The connectivity of the node
Path Selection
Conceptually, path selection is done by the users of the network, i.e. the
hosts. However, many hosts will not want to deal with keeping maps,
running path-selection algorithms, etc. It is thus expected that most hosts
will get paths from route servers.
Route servers have the advantage that they form a central point at which
an organization can express organizational policies with regard to path
preferences through the network outside the organization.
Path selection across a large graph, in the presence of multiple constraints,
is a difficult problem, and will probably be the subject of research for
decades to come.
User Data Forwarding
The plan is for Nimrod to support four basic modes for handling user data
traffic:
- Flow mode, where the user does a prior flow setup (which is also
part of Nimrod), which can include arbitrarily complex arrangements for
resource allocation, security, billing, etc; packets then carry only
flow-identifiers for doing forwarding.
- Node Chain mode, where packets carry a list of nodes, and the
packet is required to go through the nodes listed, which should define a
continuous path across the network.
(This is basically strict source route.)
- Node Sequence mode, where packets carry a list of nodes, but
the list does not define a continuous path across the network, merely points
the packet has to travel through.
(This is basically loose source route.)
- Datagram mode, where every packet header carries source and
destination locators. (This is basically normal IPv4-type forwarding.)
The first and last modes are intended to be the ones principally used; the
others are for special situations, fault isolation, critical network
operations, etc.
Datagram Mode
Flows are the fundamental entity in Nimrod, and no packet travels anywhere
except in a flow. To provide Datagram mode service, Nimrod routers will
assign a packet to a sequence of Datagram Mode Flows (DMF's), which are
relatively short-distance flows which are set up specifically to handle
Datagram mode packets.
The network is completely spanned by a mesh of DMF's. The packet traverses
the network by being assigned to a sequence of DMF's, a sequence which is
specifically selected to move the packet towards its eventual destination.
Datagram mode packet headers contain:
- A locally usable path-id field
- Source and destination locators
- A pointer into the locators
Datagram Mode Operation
While in a DMF, the forwarding routers do not examine the locators, only the
flow-id.
Only active routers (i.e., one that actually makes a decision about
where to send the packet, after it has exited a DMF - as opposed to handling
it as part of a flow) look at the locators.
At each active router, the router examines the locators in the header to see
where to send the packet next, assigns the packet to the appropriate flow,
and sends the packet off.
The pointer is initialized to point at the lowest level of the source
locator. From there, it moves up that locator, then to the destination
locator, and then down.
While going up the source locator, each active router selects a DMF which
will take the packet to the next higher level object in the source
locator, and then advances the pointer.
The process repeats, until the packet reaches a router which represents an
abstraction which is at the least common intersection of the two locators.
(E.g., for A.P.Q.R and A.X.Y.Z, this would be when the packet reaches A).
The process then inverts, with each active router selecting a DMF which
takes the packet to the next lower object in the destination locator. So, A
would select a flow to A.X, and once it got to A.X, A.X would select a flow
to A.X.Y, etc.
Datagram Mode Flows
All routers have to contain a minimal set of pre-setup
Datagram Mode flows (or be prepared to set these flows on demand) to certain
routers, which are at critical places in the abstraction hierarchy.
These flows are used to carry Datagram mode packets through the network. It
is purely a local decision which, if any, of those flows to set up before
there is an actual traffic requirement for them.
There is a minimum set of flows which do have to be able to be set up
for the system to operate, however.
Datagram Mode Flow Optimization
If the system keeps more than the minimal set of DMF's, and the router
allows longest-match lookups (e.g., in much the same way as the current
routing table for hop-by-hop datagrams), routing can be more efficient.
I.e. packets will no longer travel along strictly hierarchical paths,
but will "short-cut" closer to ultimate destinations. (When this happens,
the pointer will advance more than one level at a time in the locators.)
Traffic monitoring and analysis (again, using purely local algorithms) can
result in a database being created over time, which shows which DMF's above
and beyond the minimal set are worth keeping around.
This traffic monitoring could also show which flows from the required minimal
set of DMF's it would be useful to set up in advance of actual traffic which
needed them.
Note that all these sets can be changed in a local, incremental way, without
disturbing the operation of the system as a whole.
Datagram Mode Efficiency and Robustness
The forwarding of Datagram mode packets can be quite efficient (possibly more
so than even standard hop-by-hop):
- In the non-active routers, the packet is associated with a flow.
- In active routers, the process of looking up the next DMF would be
about as expensive as the current routing table lookup.
It can easily be seen that the process guarantees that the resulting path
is loop-free:
- Each flow selected must necessarily get the packet closer to its
destination.
- The flows themselves are guaranteed not to loop when their paths are
selected, prior to being set up.
Multicast Support
Nimrod approaches multicast with the same ideas used elsewhere: try and break
the problem up into pieces, and put as much of the functionality as possible
outside the architecture, to allow flexibility in algorithms, etc.
This last is especially important for multicast, where group sizes can vary
by many orders of magnitude.
Nimrod separates several distinct phases of creating a multicast group:
- Determining membership
- Deciding what kind of data distribution you want (per source, or
whatever)
- Calculating one or more spanning trees which connect the members
- Installing the state about those spanning trees in the routers
- Forwarding user data
Nimrod provides some very fundamental multicast building block(s), such as
multicast flow setup. All the rest (like the mechanisms to maintain the state
about group membership, calculate the spanning tree, etc) are outside the
core architecture. Users can select locally whichever algorithm makes sense
in their particular application.
This has all the advantages of the Nimrod approach to unicast:
- Makes it a simpler design
- Makes it less likely there will be something wrong
- Allow experiments with new algorithms
- Allows incremental deployment of new algorithms
Nimrod will also distinguish between a multicast group (i.e. a name for
the set of members) and a particular multicast flow (i.e. a particular
data distribution channel to that group). There can be multiple
flows which go to the same group, but controlled independently.
Flow Aggregation
Flow aggregation:
- Provides no extra capability to the users
- Adds complexity
So, why add it?
- Because it is needed to allow a positive economy of scale in high-speed
switches, since the unit (i.e. per-flow) cost of per-flow memory is higher
there.
- Flow aggregation also has the nice property that it works well with
virtual links.
Note that you don't have to have a lot of aggregation, just enough to
keep the unit cost of per-flow state memory declining at an appropriate rate.
Other Topics
There are many other topics which are not covered here. They include:
- Virtual links and routers, and recursive instantiation
- Renumbering
- Abstraction and advertisement
- Interaction with Resource Allocation
- Advanced topics in flow aggregation, such as asymmetric aggregation,
and multicast flow aggregation
- Mobility
- Topology database exchange
- Nimrod entities
- Flow setup
- Protocols
- Deployment plans
Back to the Nimrod Documentation page