Cisco DevNet Evolving Technologies
Study Guide
Nicholas Russo — CCIE #42518 (EI/SP) CCDE #20160041
Abstract
Nicholas Russo holds active CCIE certifications in Enterprise Infrastructure and Service Provider, as
well as CCDE. Nick authored a comprehensive study guide for the CCIE Service Provider version 4
examination and this document provides updates to the written test for all CCIE/CCDE tracks. Nick also
holds a Bachelor of Science in Computer Science from the Rochester Institute of Technology (RIT) and
is a frequent programmer in the field of network automation. Nick lives in Maryland, USA with his wife,
Carla, and daughters, Olivia and Josephine. For updates to this document and Nick’s other professional
publications, please follow the author on his Twitter, LinkedIn, and personal website.
Technical Reviewers: Angelos Vassiliou, Leonid Danilov, and many from the RouterGods team.
This material is not sponsored or endorsed by Cisco Systems, Inc. Cisco, Cisco Systems, CCIE and
the CCIE Logo are trademarks of Cisco Systems, Inc. and its affiliates. All Cisco products, features, or
technologies mentioned in this document are trademarks of Cisco. This includes, but is not limited to,
Cisco IOS, Cisco IOS-XE, Cisco IOS-XR, and Cisco DevNet. The information herein is provided on an
“as is” basis, without any warranties or representations, express, implied or statutory, including without
limitation, warranties of noninfringement, merchantability or fitness for a particular purpose.
Author’s Notes
This book was originally designed for the CCIE and CCDE certification tracks that introduced the “Evolving
Technologies” section of the blueprint for the written qualification exam. Those exams have since been
overhauled and many of their topics have been moved under the umbrella of Cisco DevNet. This book is not
specific to any certification track and provides an overview of the three key evolving technologies: Cloud,
Network Programmability, and Internet of Things (IoT). Italic text represents cited text from another source not
created by the author. This is often directly from a Cisco document, which is appropriate given that this
is a summary of Cisco’s vision on the topics therein. This book is not an official publication and does not
have an ISBN assigned. The book will always be free. The opinions expressed in this study guide and
its corresponding documentation belong to the author and do not necessarily represent those of Cisco. My
only request is that you not distribute this book yourself. Please direct your friends and colleagues to my
website where they can download it for free.
I wrote this book because I believe that free and open-source software is the way of the future. So too do I
believe that the manner in which this book is published represents the future of publishing. I hope this book
serves its obvious utility as a technical reference, but also as an inspiration for others to meaningfully
contribute to the open-source community.
2 Network Programmability 79
2.1 Data models and structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
2.1.1 YANG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
2.1.2 YAML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
2.1.3 JSON . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
2.1.4 XML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
2.2 Device programmability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
2.2.1 Google Remote Procedure Call (gRPC) on IOS-XR . . . . . . . . . . . . . . . . . . . 85
2.2.2 Python paramiko Library on IOS-XE . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
2.2.3 Python netmiko Library on IOS-XE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
2.2.4 NETCONF using netconf-console on IOS-XE . . . . . . . . . . . . . . . . . . . . . . . 95
2.2.5 NETCONF using Python and jinja2 on IOS-XE . . . . . . . . . . . . . . . . . . . . . . 99
2.2.6 REST API on IOS-XE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
2.2.7 RESTCONF on IOS-XE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
2.3 Controller based network design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
2.4 Configuration management tools and version control systems . . . . . . . . . . . . . . . . . . 112
2.4.1 Agent-based Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
2.4.2 Agent-less Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
2.4.3 Agent-less Demonstration with Ansible (SSH/CLI) . . . . . . . . . . . . . . . . . . . . 114
2.4.4 NETCONF-based Infrastructure as Code with Ansible . . . . . . . . . . . . . . . . . . 117
2.4.5 RESTCONF-based Infrastructure as Code with Ansible . . . . . . . . . . . . . . . . . 121
2.4.6 Agent-less Demonstration with Nornir . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
2.4.7 Version Control Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
2.4.8 Git with Github . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
2.4.9 Git with AWS CodeCommit and CodeBuild . . . . . . . . . . . . . . . . . . . . . . . . 135
2.4.10 Subversion (SVN) and comparison to Git . . . . . . . . . . . . . . . . . . . . . . . . . 143
List of Figures
1 Public Cloud High Level . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2 Private Cloud High Level . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3 Virtual Private Cloud High Level . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
4 Connecting Cloud via Private WAN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
5 Connecting Cloud via IXP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
6 Connecting Cloud via Internet VPN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
7 Comparing Virtual Machines and Containers . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
8 Viptela SD-WAN High Level . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
9 Viptela Home Dashboard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
10 Viptela Node Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
11 Viptela Event Logging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
12 Viptela Flow Exploration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
13 Viptela VoIP QoS Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
14 Cisco ACI SD-DC High Level . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
15 Cisco NFVIS Home Dashboard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
16 Cisco NFVIS Image Repository . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
17 Cisco NFVIS Image Profiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
18 Cisco NFVIS Topology Builder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
19 Cisco NFVIS Log Reporting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
20 DNA-C Home Dashboard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
21 DNA-C Geographic View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
22 DNA-C Network Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
23 DNA-C Network Profile for VNFs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
List of Tables
1 Cloud Design Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2 Cloud Security Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3 NFV Advantages and Disadvantages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
5 Git and SVN Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
6 IoT Transport Protocol Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
7 IoT Data Aggregation Protocol Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
8 Commercial Cloud Provider Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
9 Software Development Methodology Comparison . . . . . . . . . . . . . . . . . . . . . . . . . 184
2. Private: Like the joke above, this model is like an on-premises DC except it must supply the three
key ingredients identified by Cisco to be considered a “private cloud”. Specifically, this implies au-
tomation/orchestration, workload mobility, and compartmentalization must all be supported in an on-
premises DC to qualify. The organization is responsible for maintaining the cloud’s physical equip-
ment, which is extended to include the automation and provisioning systems. This can increase OPEX
as it requires trained staff. Like the on-premises DC, private clouds provide application services to a
given organization and multi-tenancy is generally limited to business units or projects/programs within
that organization (as opposed to external customers). The diagram that follows illustrates a high-level
example of a private cloud.
3. Virtual Private: A virtual private cloud is a combination of public and private clouds. An organization
may decide to use this to offload some (but not all) of its DC resources into the public cloud, while
retaining some things in-house. This can be seen as a phased migration to public cloud, or by some
skeptics, as a non-committal trial. This allows a business to objectively assess whether the cloud
is the “right business decision”. This option is a bit complex as it may require moving workloads
between public/private clouds on a regular basis. At the very minimum, there is the initial private-to-
public migration; this could be time consuming, challenging, and expensive. This design is sometimes
called a “hybrid cloud” and could, in fact, represent a business’ IT end-state. The diagram that follows illustrates a high-level example of a virtual private cloud.
4. Inter-cloud: Like the Internet (an interconnection of various autonomous systems provides reachability
between all attached networks), Cisco suggests that, in the future, the contiguity of cloud computing
may extend between many third-party organizations. This is effectively how the Internet works; a
customer signs a contract with a given service provider (SP) yet has access to resources from several
thousand other service providers on the Internet. The same concept could be applied to cloud and
this is an active area of research for Cisco.
Below is a based-on-a-true-story discussion that highlights some of the decisions and constraints relating
to cloud deployments.
1. An organization decides to retain their existing on-premises DC for legal/compliance reasons. By
adding automation/orchestration and multi-tenancy components, they are able to quickly increase
and decrease virtual capacity. Multiple business units or supported organizations are free to adjust
their security policy requirements within the shared DC in a manner that is secure and invisible to
other tenants; this is the result of compartmentalization within the cloud architecture. This deployment
would qualify as a “private cloud”.
2. Years later, the same organization decides to keep their most important data on-premises to meet
seemingly-inflexible Government regulatory requirements, yet feels that migrating a portion of their
private cloud to the public cloud is a solution to reduce OPEX long term. This increases the scalability
of the systems that the Government does not regulate, such as virtualized network components
or identity services, as the on-premises DC is bound by CAPEX reductions. The private cloud footprint
can now be reduced as it is used only for a subset of tightly controlled systems, while the more generic
platforms can be hosted from a cloud provider at lower cost. Note that actually exchanging/migrating
workloads between the two clouds at will is not appropriate for this organization as they are simply
trying to outsource capacity to reduce cost. As discussed earlier, this deployment could be considered
a “virtual private cloud” by Cisco, but is also commonly referred to as a “hybrid cloud”.
3. Years later still, this organization considers a full migration to the public cloud. Perhaps this is made
possible by the relaxation of the existing Government regulations or by new security enhancements available in the public cloud.
2. Internet Exchange Point (IXP): A customer’s network is connected via the IXP LAN (might be a
LAN/VLAN segment or a layer-2 overlay) into the cloud provider’s network. The IXP network is gen-
erally access-like and connects different organizations together so that they can peer with Border
Gateway Protocol (BGP) directly, but typically does not provide transit services between sites like a
private WAN. Some describe an IXP as a “bandwidth bazaar” or “bandwidth marketplace” where such
exchanges can happen in a local area. A strict SLA may not be guaranteed but performance would be
expected to be better than the Internet VPN. This is likewise an acceptable choice for virtual private
(hybrid) cloud but lacks the tight SLA typically offered in private WAN deployments. A company could,
for example, use internet VPNs for inter-site traffic and an IXP for public cloud access. A private WAN
for inter-site access is also acceptable.
3. Internet VPN: By far the most common deployment, a customer creates a secure VPN over the
Internet (could be multipoint if outstations require direct access as well) to the cloud provider. It
is simple and cost effective, both from a WAN perspective and DC perspective, but offers no SLA
whatsoever. Although suitable for most customers, it is likely to be the most inconsistently performing
option. While broadband Internet connectivity is much cheaper than private WAN bandwidth (in terms
of price per Mbps), the quality is often lower. Whether this is “better” is debatable and depends on the
business drivers. Also note that Internet VPNs, even high bandwidth ones, offer no latency guarantees
at all. This option is best for fully public cloud solutions since the majority of traffic transiting this VPN
tunnel should be user service flows. The solution is likely to be a poor choice for virtual private clouds,
especially if workloads are distributed between the private and public clouds. The biggest drawback
of the Internet VPN access design is that slow cloud performance as a result of the “Internet” is
something a company cannot influence; buying more bandwidth is the only feasible solution. In this
example, the branches don’t have direct Internet access (but they could), so they rely on an existing
private WAN to reach the cloud service provider.
This book does not detail the full Docker installation on CentOS because it is already well-documented and
not relevant to learning about containers. Once Docker has been installed, run the following verification
commands to ensure it is functioning correctly. Any modern version of Docker is sufficient to follow the
example that will be discussed.
[centos@docker build]$ which docker && docker --version
/usr/bin/docker
Docker version 17.09.1-ce, build 19e2cf6
Begin by running a new CentOS7 container. These images are stored on DockerHub and are automatically
downloaded when they are not locally present. For example, this machine has not run any containers yet,
and no images have been explicitly downloaded. Thus, Docker is smart enough to pull the proper image
from DockerHub and spin up a new container. This only takes a few seconds on a high-speed Internet
connection. Once complete, Docker drops the user into a new shell as the root user inside the container.
The -i option enables an interactive session and the -t option allocates a pseudo-TTY, which is great for demonstrations.
Note that running Docker containers in the background is much more common as there are typically many
containers.
[centos@docker build]$ docker container run -it centos:7
Unable to find image 'centos:7' locally
7: Pulling from library/centos
[root@088bbd2a7544 /]#
To verify that the correct container was downloaded, run the following command. Then, exit from the
container, as the only use for CentOS7 in our example is to serve as a “base” image for the custom Ansible
image to be created.
[root@088bbd2a7544 /]# cat /etc/redhat-release
CentOS Linux release 7.4.1708 (Core)
To build a custom image, one creates a Dockerfile. It is a plain text file that closely resembles a shell script
and is designed to procedurally assemble the required components of a container image for use later. The
author already created a Dockerfile using a CentOS7 image as a base image and added some additional
features to it. Every step has been commented for clarity.
Dockerfiles are typically written to minimize both the number of “layers” and the build time. Each
instruction generally qualifies as a layer. The more complex and less variable layers should be placed
towards the top of the Dockerfile, making them deeper layers. For example, installing key packages and
cloning the code necessary for the container’s primary purpose occurs early. Layers that are more likely
to change, such as version-specific Ansible environment setup parameters, can come later. This way, if
the Ansible environment changes and the image needs to be rebuilt, only the layers at or after the point
of modification must be rebuilt. The base CentOS7 image and original yum package installations remain
unchanged, substantially reducing the image build time. Fewer RUN directives also result in fewer layers,
which explains the extensive use of && and \ in the Dockerfile.
[centos@docker build]$ cat Dockerfile
# Start from CentOS 7 base image.
FROM centos:7
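The remaining instructions of the author’s Dockerfile are not reproduced above. A minimal sketch that is consistent with the five steps referenced later (FROM, RUN, RUN, WORKDIR, HEALTHCHECK) is shown below; the repository URL and package list are illustrative assumptions, not the author’s actual source.
# Start from CentOS 7 base image.
FROM centos:7
# Deep layer: install OS packages and clone the Ansible source (placeholder URL).
RUN yum update -y && \
    yum install -y git epel-release && \
    yum install -y python-pip && \
    git clone https://github.com/example/ansible.git /ansible
# Shallower layer: environment-specific Python dependencies, more likely to change.
RUN pip install -r /ansible/requirements.txt
# Start interactive sessions in the examples directory.
WORKDIR /ansible/examples
# Mark the container unhealthy if the ansible binary is not found.
HEALTHCHECK CMD which ansible || exit 1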
The Dockerfile is effectively a set of instructions used to build a custom image. To build the image based on
the Dockerfile, issue the command below. The -t option specifies a tag, and in this case, cmd_authz is used
since this particular Dockerfile is using a specific branch from a specific Ansible developer’s personal Github
page. It would be unwise to call this simple ansible or ansible:latest due to the very specific nature of
this container and subsequent test. Because the user is in the same directory as the Dockerfile, specify the .
to choose the current directory. Each of the 5 steps in the Dockerfile (FROM, RUN, RUN, WORKDIR, HEALTHCHECK)
is logged in the output below, which looks almost identical to what one would see when running these commands interactively.
[centos@docker build]$ docker image build -t ansible:cmd_authz .
Sending build context to Docker daemon 7.168kB
Step 1/5 : FROM centos:7
---> e934aafc2206
Step 2/5 : RUN yum update -y && yum install -y git [snip]
Loaded plugins: fastestmirror, ovl
Determining fastest mirrors
* base: mirrors.lga7.us.voxel.net
* extras: repo1.ash.innoscale.net
* updates: repos-va.psychz.net
Resolving Dependencies
--> Running transaction check
---> Package acl.x86_64 0:2.2.51-12.el7 will be updated
[snip, many more packages]
Complete!
Loaded plugins: fastestmirror, ovl
Cleaning repos: base extras updates
Cleaning up everything
Cleaning up list of fastest mirrors
Done!
Once complete, there will be a new image in the image list. Note that there are not any new containers,
since this image has not been run yet. It is ready to be instantiated as a container, or even pushed up
to DockerHub for others to use. Last, note that the image more than doubled in size. Because many
new packages were added for specific purposes, this makes the image less portable. Smaller is always
better, especially for generic images.
[centos@docker build]$ docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
ansible cmd_authz a8a6ac1b44e2 2 minutes ago 524MB
centos 7 e934aafc2206 7 weeks ago 199MB
For additional detail about this image, the following command returns extensive data in JSON format.
Docker uses a technique called layering whereby each command in a Dockerfile is a layer, and making
changes later in the Dockerfile won’t affect the lower layers. This is why the things least likely to change
should be placed towards the top, such as the base image, common package installs, etc. This reduces
image building time when Dockerfiles are changed.
[centos@docker build]$ docker image inspect a8a6ac1b44e2 | head -5
[
{
"Id": "sha256:a8a6ac1b44e28f654572bfc57761aabb5a92019c[snip]",
"RepoTags": [
"ansible:cmd_authz"
To run a container, use the same command shown earlier to start the CentOS7 container. Specify the image
name, and in less than a second, the new container is 100% operational. Ansible should be installed on this
container as part of the image creation process, so be sure to test this. Running the “setup” module on the
control machine (the container itself) should yield several lines of JSON output about the device itself. Note
that, towards the bottom of this output dump, ansible is aware that it is inside a Docker container.
[centos@docker build]$ docker container run -it ansible:cmd_authz
[root@04eb3ee71a52 examples]# which ansible && ansible -m setup localhost
Before running this playbook, a few Ansible adjustments are needed. First, adjust the ansible.cfg file to use
the hosts.yml inventory file and disable host key checking. Ansible needs to know which network devices
are in its inventory and how to handle unknown SSH keys.
[root@04eb3ee71a52 examples]# head -20 ansible.cfg
[snip, comments]
[defaults]
inventory = hosts.yml
host_key_checking = False
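The hosts.yml inventory itself is not shown here. A minimal sketch consistent with the device that appears in the task output below might look like the following; the routers group name is an assumption for illustration.
---
# hosts.yml (illustrative sketch)
all:
  children:
    routers:
      hosts:
        csr1.njrusmc.net:
...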
TASK [IOS >> Run show command from config mode] **************
changed: [csr1.njrusmc.net]
After exiting this container, check the list of containers again. There are now two containers in the list, with the
newest one at the top; this is the Ansible container we just exited after completing our test. Again, some
output has been truncated to make the table fit neatly.
[centos@docker build]$ docker container ls -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
04eb3ee71a52 ans:cmd_authz "/bin/bash" 33 m ago Exited (127) 7 s ago adoring_mestorf
088bbd2a7544 centos:7 "/bin/bash" 43 m ago Exited (0) 42 m ago wise_banach
This manual “start and stop” approach to containerization has several drawbacks. Two are listed below:
1. To retest this solution, the playbook would have to be created again, and the Ansible environment files
(ansible.cfg, hosts.yml) would need to be updated again. Because containers are ephemeral,
this information is not stored automatically.
2. The commands are difficult to remember and it can be a lot to type, especially when starting many
containers. Since containers were designed for microservices and expected to be deployed in depen-
dent groups, this management strategy scales poorly.
Docker includes a feature called docker-compose. Using YAML syntax, developers can specify all the
containers they want to start, along with any minor options for those containers, then execute the compose
file like a script. It is better than a shell script since it is more portable and easier to read. It is also an easy
way to add volumes to Docker. There are different kinds of volumes, but in short, volumes allow persistent
data to be passed into and retrieved from containers. In this example, a simple directory mapping (known
as a “bind mount” in Docker) is built from the local mnt_files/ folder to the container’s file system. In this
folder, one can copy the Ansible files (issue31575.yml, ansible.cfg, hosts.yml) so the container has
immediate access. While it is possible to handle volume mounting from the commands viewed previously,
it is tedious and complex.
# docker-compose.yml
version: '3.2'
services:
  ansible:
    image: ansible:cmd_authz
    hostname: cmd_authz
    # Next two lines are equivalent of -i and -t, respectively
    stdin_open: true
    tty: true
    volumes:
      - type: bind
        source: ./mnt_files
        target: /ansible/examples/mnt_files
The contents of these files were shown earlier; ensure they are all placed in the mnt_files/ directory
relative to where the docker-compose.yml file is located.
[centos@docker compose]$ tree --charset=ascii
.
|-- docker-compose.yml
`-- mnt_files
|-- ansible.cfg
|-- hosts.yml
`-- issue31575.yml
To run the docker-compose file, use the command below. It will build containers for all keys specified under the services key.
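The exact invocation is not reproduced in this section. A typical command, assuming it is run from the directory containing docker-compose.yml, would be the following; the -d behavior (detached containers) matches what is observed later in this demonstration.
[centos@docker compose]$ docker-compose up -d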
The command below says “execute, on the ansible container, the bash command” which grants shell ac-
cess. Ensure that the mnt_files/ directory exists and contains all the necessary files. Copy the contents
to the current directory, which will overwrite the basic ansible.cfg and hosts.yml files provided by Ansible.
[centos@docker compose]$ docker-compose exec ansible bash
[root@cmd_authz examples]# tree mnt_files/ --charset=ascii
mnt_files/
|-- ansible.cfg
|-- hosts.yml
`-- issue31575.yml
Run the playbook again, and observe the same results as before. Now, assuming that this issue remains
open for a long period of time, docker-compose helps reduce the test setup time.
[root@cmd_authz examples]# ansible-playbook issue31575.yml
Exit from the container and check the container list again. Notice that, despite exiting, the container con-
tinues to run. This is because docker-compose created the container in a detached state, meaning the
absence of the shell does not cause the container to stop. Manually stop the container using the com-
mands below. Note that only the first few characters of the container ID can be used for these operations.
[centos@docker compose]$ docker container ls -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c16452e2a6b4 ansible:cmd_authz "/bin/bash" 12 m ago Up 10 m (health: ...) compose_ansible_1
04eb3ee71a52 ansible:cmd_authz "/bin/bash" 2 h ago Exited (127) 2 h ago adoring_mestorf
088bbd2a7544 centos:7 "/bin/bash" 2 h ago Exited (0) 2 h ago wise_banach
When Ansible network modules such as ios_command and ios_config were introduced, they required
provider dictionaries to log into network devices. This dictionary wrapped basic login information such
as hostname/IP address, username, password, and timeouts into a single dictionary object. While this
technique was brilliant for its day, the Ansible team acknowledged that this made network devices “different”
and having a unified SSH access method would be a better long-term solution. These features were
introduced in Ansible 2.5, but suppose you wrote all your playbooks in Ansible 2.4. How could you safely
run two versions of Ansible on a single machine to perform the necessary refactoring? Python virtual
environments (venv for short) are a good solution to this problem.
First, create a new venv for Ansible 2.4.2 to demonstrate the now-deprecated provider dictionary method.
The command below creates a new directory called ansible242/ and populates it with many files needed
to create a separate development environment. This book does not explore the inner workings of venv, but
does include a link in the references section.
[ec2-user@devbox venv]$ virtualenv ansible242
New python executable in /home/ec2-user/venv/ansible242/bin/python2
Also creating executable in /home/ec2-user/venv/ansible242/bin/python
Installing setuptools, pip, wheel...done.
[ec2-user@devbox venv]$ ls -l
total 0
drwxrwxr-x. 5 ec2-user ec2-user 82 Aug 22 07:06 ansible242
The purpose of venv is to create a virtual Python workspace, so any Python utilities and libraries should
be used within the venv. To activate the venv, use the source command to update your current shell. The
prompt changes to show the venv name at the far left. Use which to reveal that the pip binary has been
selected from within the venv.
[ec2-user@devbox venv]$ which pip
/usr/bin/pip
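The activation step itself is not captured above; the output shown reflects the system pip before activation. A typical sequence, assuming the ansible242/ directory created earlier, would be the following (the post-activation path is an assumption).
[ec2-user@devbox venv]$ source ansible242/bin/activate
(ansible242) [ec2-user@devbox venv]$ which pip
/home/ec2-user/venv/ansible242/bin/pip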
Install the correct version of Ansible using pip, and then check the site-packages within the venv to see that
Ansible 2.4.2 has been installed.
(ansible242) [ec2-user@devbox ansible242]$ pip install ansible==2.4.2.0
Collecting ansible==2.4.2.0
Collecting cryptography (from ansible==2.4.2.0)
[snip, many packages]
Successfully installed MarkupSafe-1.0 PyYAML-3.13 ansible-2.4.2.0 [snip]
The venv now has a functional Ansible 2.4.2 environment where playbook development can begin. This
demonstration shows a simple login playbook that the author has used in production just to SSH into all
devices. It’s the Cisco IOS equivalent of the Ansible ping module which is used primarily for testing SSH
reachability to Linux hosts. The source code is shown below. Note that there are only two variables defined.
The first tells Ansible which Python binary to use to ensure the proper libraries are used. A fully qualified file
name must be used as shortcuts like ~ are not allowed. The second variable is a nested login credentials
dictionary.
(ansible242) [ec2-user@devbox login]$ tree --charset=ascii
.
|-- group_vars
| `-- routers.yml
|-- inv.yml
`-- login.yml
---
# group_vars/routers.yml
ansible_python_interpreter: "/home/ec2-user/venv/ansible242/bin/python"
login_creds:
host: "{{ inventory_hostname }}"
username: "ansible"
password: "ansible"
...
---
# login.yml
- name: "Login to all routers"
hosts: routers
connection: local
gather_facts: false
tasks:
- name: "Run 'show clock' command"
ios_command:
provider: "{{ login_creds }}"
commands: "show clock"
...
Run the playbook with the custom inventory (containing one router called csr1) and verbosity enabled
so that the CLI output is printed to standard output.
(ansible242)[ec2-user@devbox login]$ ansible-playbook login.yml -i inv.yml -v
Using /etc/ansible/ansible.cfg as config file
STDOUT:
To refactor this playbook from the old provider-style login to the new network_cli login, create a second
venv alongside the existing one. It is named ansible263, which is the current version of Ansible at the
time of this writing. The steps are shown below but are not explained in detail as they were in the first
example.
[ec2-user@devbox venv]$ virtualenv ansible263
New python executable in /home/ec2-user/venv/ansible263/bin/python2
Also creating executable in /home/ec2-user/venv/ansible263/bin/python
Installing setuptools, pip, wheel...done.
(ansible263) [ec2-user@devbox login]$ ansible --version
ansible 2.6.3
Ansible playbook development can begin now, and to save some time, recursively copy the login playbook
from the old venv into the new one. Because Python virtual environments are really just separate directory
structures, moving source code between them is easy. It is worth noting that source code does not have
to exist inside a venv. It may exist in one specific location and the refactoring effort could be done on a
version control feature branch. In this way, multiple venvs could access a common code base. In this
simple example, code is copied between venvs.
(ansible263) [ec2-user@devbox ansible263]$ cp -R ../ansible242/login/ .
(ansible263) [ec2-user@devbox ansible263]$ tree login/ --charset=ascii
login/
|-- group_vars
| `-- routers.yml
|-- inv.yml
`-- login.yml
Modify the group variables and playbook files according to the code shown below. Rather than define
a custom dictionary with login credentials, one can specify some values for the well-known Ansible login
parameters. At the playbook level, the connection changes from local to network_cli and the inclusion of the
provider key under ios_command is no longer needed. Last, note that the Python interpreter path is updated
for this specific venv using the directory ansible263/.
---
# group_vars/routers.yml
ansible_python_interpreter: "/home/ec2-user/venv/ansible263/bin/python"
ansible_network_os: "ios"
ansible_user: "ansible"
ansible_ssh_pass: "ansible"
...
---
# login.yml
- name: "Login to all routers"
  hosts: routers
  connection: network_cli
  gather_facts: false
  tasks:
    - name: "Run 'show clock' command"
      ios_command:
        commands: "show clock"
...
Running this playbook should yield the exact same behavior as the original playbook except modernized
for the new version of Ansible. Using virtual environments to accomplish this simplifies library and binary
executable management when testing multiple versions.
(ansible263)[ec2-user@devbox login]$ ansible-playbook login.yml -i inv.yml -v
STDOUT:
1.7 Connectivity
Network virtualization is often misunderstood as being something as simple as “virtualize this device using
a hypervisor and extend some VLANs to the host”. Network virtualization really refers to the creation of
virtual topologies using a variety of technologies to achieve a given business goal. Sometimes these virtual
topologies are overlays, sometimes they are forms of multiplexing, and sometimes they are a combination
of the two. Before discussing specific technical topics like virtual switches and SDN, it is worth reviewing
the basic virtualization techniques upon which all of these solutions rely. Here are some common examples
(not a complete list) of network virtualization using well-known technologies; a brief configuration sketch follows the list.
1. Ethernet VLANs using 802.1q encapsulation. Often used to create virtual networks at layer 2
for security segmentation, traffic hair pinning through a service chain, etc. This is a form of data
multiplexing over Ethernet links. It isn’t a tunnel/overlay since the layer 2 reachability information
(MAC address) remains exposed and used for forwarding decisions.
2. VPN Routing and Forwarding (VRF) tables or other layer-3 virtualization techniques. Similar
uses as VLANs except virtualizes an entire routing instance, and is often used to solve a similar set of
problems. Can be combined with VLANs to provide a complete virtual network between layers 2 and
3. Can be coupled with GRE for longer-range virtualization solutions over a core network that may or
may not have any kind of virtualization. This is a multiplexing technique as well but is control-plane
only since there is no change to the packets on the wire, nor is there any inherent encapsulation (not
an overlay).
3. Frame Relay DLCI encapsulation. Like a VLAN, creates segmentation at layer 2 which might be
useful for last-mile access circuits between PE and CE for service multiplexing. The same is true
for Ethernet VLANs when using EV services such as EV-LINE, EV-LAN, and EV-TREE. This is a
data-plane multiplexing technique specific to Frame Relay.
4. MPLS VPNs. Different VPN customers, whether at layer 2 or layer 3, are kept completely isolated by
being placed in a separate virtual overlay across a common core that has no/little native virtualization.
This is an example of an overlay type of virtual network.
5. Virtual eXtensible Local Area Network (VXLAN). Just like MPLS VPNs; creates virtual overlays atop a
potentially non-virtualized core. VXLAN is a MAC-in-IP/UDP tunneling encapsulation designed to
provide layer-2 mobility across a data center fabric with an IP-based underlay network. The advantage
is that the large layer-2 domain, while it still exists, is limited to the edges of the network, not the
core. VXLAN by itself uses a “flood and learn” strategy so that the layer-2 edge devices can learn
the MAC addresses from remote edge devices, much like classic Ethernet switching. This is not a
good solution for large fabrics where layer-2 mobility is required, so VXLAN can be paired with BGP’s
Ethernet VPN (EVPN) address family to provide MAC routing between endpoints. Being UDP-based,
the VXLAN source ports can be varied per flow to provide better underlay (core IP transport) load balancing.
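As a concrete illustration of items 1 and 2 above, the brief IOS-style configuration sketch below combines an 802.1Q subinterface with a VRF; the VRF name, VLAN ID, and addressing are arbitrary examples.
! Define a VRF and bind an 802.1Q subinterface to it
vrf definition CUSTOMER_A
 address-family ipv4
!
interface GigabitEthernet0/0.100
 encapsulation dot1Q 100
 vrf forwarding CUSTOMER_A
 ip address 192.0.2.1 255.255.255.0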
The remainder of this section walks through a high-level demonstration of the Viptela SD-WAN solution’s
various interfaces. Upon login to vManage, the centralized and multi-tenant management system, the user
is presented with a comprehensive dashboard. At its most basic, the dashboard alerts the administrator to
any obvious issues, such as sites being down or other errors needing repair.
Clicking on the vEdge number “4”, one can explore the status of the four remote sites. While not particularly
interesting in a network where everything is working, this provides additional details about the sites in the
network, and is a good place to start troubleshooting when issues arise.
Next, the administrator can investigate a specific node in greater detail to identify any faults recorded in the
event log. The screenshot on the following page is from SDWAN4, which provides a visual representation
of the current events and the text details in one screen.
The screenshot below depicts the bandwidth consumed between different hosts on the network. More
granular details such as ports, protocols, and IP addresses are available between the different monitoring
options from the left-hand pane. This screenshot provides output from the “Flows” option on the SDWAN4
node, which is a physical vEdge-100m appliance.
Last, the solution allows for granular flow-based policy control, similar to traditional policy-based routing,
except centrally controlled and fully dynamic. The screenshot below shows a policy to match DSCP 46,
typically used for expedited forwarding of inelastic, interactive VOIP traffic. The preferred color (preferred
link in this case) is the direct Ethernet-based Internet connection this particular node has. Not shown
is the backup 4G LTE link this vEdge-100m node also has. This link is slower, higher latency, and less
preferable for voice transport, so we administratively prefer the wireline Internet link. Not shown is the SLA
configuration and other policy parameters to specify the voice performance characteristics that must be
met. For example: 150 ms one way latency, less than 0.1% packet loss, and less than 30 ms jitter. If the
wireline Internet exceeds any of these thresholds, the vSmart controllers will automatically start using the
4G LTE link, assuming that its performance is within the SLA’s specification.
For those interested in replicating this demonstration, please visit Cisco dCloud. Note that the com-
pute/storage requirements for these Cisco SD-WAN components are very low, making it easy to run almost
anywhere. The only exception is the vManage component and its VM requirements can be found here. The
VMs can be run either on VMware ESXi or Linux KVM-based hypervisors (which includes Cisco NFVIS
discussed later in this book).
Advantage/Benefit                              | Disadvantage/Challenge
Faster rollout of value added services         | Likely to observe decreased performance
Reduced CAPEX and OPEX                         | Scalability exists only in purely NFV environment
Less reliance on vendor hardware refresh cycle | Interoperability between different VNFs
Mutually beneficial to SDN (complementary)     | Mgmt and orchestration alongside legacy systems
NFV infrastructure (NFVi) encompasses all of the NFV related components, such as virtualized network
functions (VNF), management, orchestration, and the underlying hardware devices (compute, network, and
storage). That is, the totality of all components within an NFV system can be considered an NFVI instance.
Suppose a large service provider is interested in NFVI in order to reduce time to market for new services
while concurrently reducing operating costs. Each regional POP could be outfitted with a “mini data center”
consisting of NFVI components. Some call this an NFVI POP, which would house VNFs for customers within
the region it serves. It would typically be centrally managed by the service provider’s NOC, along with all of
the other NFVI POPs deployed within the network. The amalgamation of these NFVI POPs forms part of an
organization’s overall NFVI design.
The VNF repository can store VNF images for rapid deployment. Each image can also be instantiated many
times with different settings, known as profiles. For example, the system depicted in the screenshots below
has two images: Cisco ASAv firewall and the Viptela vEdge SD-WAN router.
The Cisco ASAv, for example, has multiple performance tiers based on scale. The ASAv5 is better suited
to small branch sites, with the larger flavors being able to process more concurrent flows, remote-access VPN
clients, and other processing/memory intensive activities. The NFVIS hypervisor can store many different
“flavors” of a single VNF to allow for rapidly upgrading a VNF’s performance capabilities as the organization’s
IT needs grow.
Once the images with their corresponding profiles have been created, each item can be dragged-and-
dropped onto a topology canvas to create a virtual network or service chain. Each LAN network icon is
effectively a virtual switch (a VLAN), connecting virtual NICs on different VNFs together to form the correct
flow path. On many other hypervisors, the administrator needs to manually build this connectivity as VMs
come and go, or possibly script it. With NFVIS, the intuitive GUI makes it easier for network operators to
adjust the switched topology of the intra-NFVIS network.
Note that the bottom of the screen has some ports identified as single root input/output virtualization (SR-
IOV). These are high-performance connection points for specific VNFs to bypass the hypervisor-managed
internal switching infrastructure and connect directly to Peripheral Component Interconnect express (PCIe)
resources. This improves performance and is especially useful for high bandwidth use cases.
Last, NFVIS provides local logging management for all events on the hypervisor. This is particularly useful
for remote sites where WAN outages separate the NFVIS from the headend logging servers. The on-box
logging and its ease of navigation can simplify troubleshooting during or after an outage.
After clicking on the Design option, the main design screen displays a geographic map of the network in the
Network Hierarchy view. In this small network, the region of Aberdeen has two sites within it, Site200 and
Site300. Each of these sites has a Cisco ENCS 5412 platform running NFVIS 3.8.1-FC3; they represent
large branch sites. Additional sites can be added manually or imported from a comma-separated values
(CSV) file. Each of the other subsections is worth a brief discussion:
1. Network Settings: This is where the administrator defines basic network options such as IP address
pools, QoS settings, and integration with wireless technologies.
2. Image Repository: The inventory of all images, virtual and physical, that are used in the network.
Multiple flavors of an image can be stored, with one marked as the “golden image” that DNA-C will
ensure is running on the corresponding network devices.
3. Network Profiles: These network profiles bind the specific VNF instances to a network hierarchy,
serving as network-based intent instructions for DNA-C. A profile can be applied globally, regionally,
or to a site. In this demonstration, the “Routing & NFV” profile is used, but DNA-C also supports a
“Switching” profile and a “Wireless” profile, both of which simplify SDA operations.
4. Auth Template: These templates enable faster IEEE 802.1X configuration. The 3 main options include
closed authentication (strict mode), easy connect (low impact mode), and open authentication (anyone
can connect). Administrators can add their own port-based authentication profiles here for more
granularity. Since 802.1X is not used in this demonstration, this particular option is not discussed
further.
Additionally, the Network Profiles tab is particularly interesting for this demonstration as VNFs are being
provisioned on remote ENCS platforms running NFVIS. On a global, regional, or per site basis, the ad-
ministrator can identify which VNFs should run on which NFVIS-enabled sites. For example, sites in one
region may only have access to high-latency WAN transport, and thus could benefit from WAN optimization
VNFs. Such an expense may not be required in other regions where all transports are relatively low-latency.
The screenshot below shows an example. Note the similarities with the NFVIS drag-and-drop GUI; in this
solution, the administrator checks boxes on the left hand side of the screen to add or remove VNFs. The
virtual networking between VNFs is defined elsewhere in the profile and is not discussed in detail here.
After configuring all of the network settings, administrators can populate their Image Repository. This
contains a list of all virtual and physical images currently loaded onto DNA-C. There are two screenshots
below. The first shows the physical platform images, in this case, the NFVIS hypervisor. Appliance software,
such as a router IOS image, could also appear here. The second screenshot shows the virtual network
functions (VNFs) that are present in DNA-C. In this example, there is a Viptela vEdge SD-WAN router and
ASAv image.
After completing all of the design steps (for brevity, several were not discussed in detail here), navigate
back to the main screen and explore the Policy section. The policy section is SDA-focused and provides options for defining the security policies applied within the SDA fabric.
After applying any SDA-related security policies into the network, it’s time to provision the VNFs on the
remote ENCS platforms running NFVIS. The screenshot below targets site 200. For the initial day 0
configuration bootstrapping, the administrator must tell DNA-C what the publicly-accessible IP address of
the remote NFVIS is. This management IP could change as the ENCS is placed behind NAT devices or in
different SP-provided DHCP pools. In this example, bogus IPs are used as an illustration.
Note that the screenshot is on the second step of the provisioning process. The first step just confirms the
network profile created earlier, which identifies the VNFs to be deployed at a specific level in the network
hierarchy (global, regional, or site). The third step allows the user to specify access port configuration, such
as VLAN membership and interface descriptions. The summary tab gives the administrator a review of the
provisioning process before deployment.
The screenshot that follows shows a log of the provisioning process. This gives the administrator confidence
that all the necessary steps were completed, and also provides a mechanism for troubleshooting any issues
that arise. Serial numbers and public IP addresses are masked for security.
In summary, DNA-C is a powerful tool that unifies network design, SDA policy application, and VNF provi-
sioning across an enterprise environment.
A node is a worker machine in Kubernetes, which can be physical or virtual. Where the pods are com-
ponents of a deployment/application, nodes are components of a cluster. Although an administrator can
just “create” nodes in Kubernetes, this creation is just a representation of a node. The usability/health of
a node depends on whether the Kubernetes master can communicate with the node. Because nodes can
be virtual platforms and hostnames can be DNS-resolvable, the definition of these nodes can be portable
between physical infrastructures.
A cluster is a collection of nodes that are capable of running pods, deployments, replica sets, etc. The
Kubernetes master is a special type of node which facilitates communications within the cluster. It is re-
sponsible for scheduling pods onto nodes and responding to events within the cluster. A node-down event,
for example, would require the master to reschedule pods running on that node elsewhere.
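Assuming kubectl is already configured to communicate with the cluster, node membership and health can be checked directly from the shell; the commands below are standard kubectl queries, and the node name is a placeholder.
# List every node along with its status (Ready/NotReady), role, and version
kubectl get nodes
# Show detailed conditions (memory, disk, PID pressure) for a single node
kubectl describe node <node-name>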
A service is a concept used to group pods of similar functionality together. For example, many database
containers might serve content for a web application. The database group could be scaled up or down (i.e.
they change often), and the application servers must target the correct database containers to read/write
data. The service often has a label, such as “database”, which would also exist on pods. Whenever the
web application communicates to the service over TCP/IP, the service communicates to any pod with the
“database” tag. Services could include node-specific ports, which is a simple port forwarding mechanism
to access pods on a node. Advanced load balancing services are also available but are not discussed in
detail in this book.
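A minimal sketch of such a service is shown below, assuming the database pods carry an app: database label and listen on TCP port 5432; all names and port numbers are illustrative.
---
# database-service.yml (illustrative sketch)
apiVersion: v1
kind: Service
metadata:
  name: database
spec:
  type: NodePort
  selector:
    app: database
  ports:
    - port: 5432
      targetPort: 5432
      nodePort: 30432
...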
Labels are an important Kubernetes concept and warrant further discussion. Almost any resource in Ku-
bernetes can carry a collection of labels, which is a key/value pair. For example, consider the blue/green
deployment model for an organization. This architecture has two identical production-capable software
instances (blue and green), and one is in production while the other is upgraded/changed. Using JSON
syntax, one set of pods (or perhaps an entire deployment) might be labeled as {"color": "blue"} while
the other is {"color": "green"}. The key of “color” is the same so the administrator can query for “color”
label to get the value, and then make a decision based on that. One Cisco engineer described labels as a
flexible and extensible source of metadata. They can reference releases of code, locations, or any sort of
logical groupings. There is no limit on how many labels can be applied. In this way, labels are similar
to tags in Ansible, which can be used to pick-and-choose certain tasks to execute or skip, depending on the situation.
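As a brief command-line illustration of the blue/green example, assuming two deployments named web-blue and web-green already exist, labels can be applied and queried as follows; the -l flag performs the label-based selection described above.
# Attach a color label to each deployment
kubectl label deployment web-blue color=blue
kubectl label deployment web-green color=green
# Query only the resources carrying the blue label
kubectl get deployments -l color=blue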
The minikube solution provides a relatively easy way to get started with Kubernetes. It is a VM that can run
on Linux, Windows, or Mac OS using a variety of underlying hypervisors. It represents a tiny, single-node
Kubernetes cluster suitable for learning and local testing.
Starting minikube is as easy as the command below. Check the status of the Kubernetes cluster to ensure
there are no errors. Note that a local IP address is allocated to minikube to support outside-in access to
pods and the cluster dashboard.
Nicholass-MBP:localkube nicholasrusso$ minikube start
Starting local Kubernetes v1.10.0 cluster...
Starting VM...
Getting VM IP address...
Moving files into cluster...
Setting up certs...
Connecting to cluster...
Setting up kubeconfig...
Starting cluster components...
Kubectl is now configured to use the cluster.
Loading cached images from config file.
As discussed earlier, there is a variety of port exposing techniques. The “NodePort” option allows outside
access into the deployment using TCP port 8080 which was defined when the deployment was created.
Nicholass-MBP:localkube nicholasrusso$ kubectl expose deployment \
> hello-minikube --type=NodePort
service "hello-minikube" exposed
Check the pod status quickly to see that the pod is still in a state of creating the container. A few seconds
later, the pod is operational.
Nicholass-MBP:localkube nicholasrusso$ kubectl get pod
NAME READY STATUS RESTARTS AGE
hello-minikube-c8b6b4fdc-nz5nc 0/1 ContainerCreating 0 17s
Viewing the network services, Kubernetes reports which resources are reachable using which IP/port com-
binations. Actually reaching these IP addresses may be impossible depending on how the VM is set up on
your local machine, and considering minikube is not meant for production, it isn’t a big deal.
Nicholass-MBP:localkube nicholasrusso$ kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
hello-minikube NodePort 10.98.210.206 <none> 8080:31980/TCP 15s
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 7h
Next, we will scale the application by increasing the replica sets (rs) from 1 to 2. Replica sets, as discussed
earlier, are copies of pods typically used to add capacity to an application in an automated and easy way.
Kubernetes has built-in support for load balancing to replica sets as well.
Nicholass-MBP:localkube nicholasrusso$ kubectl get rs
NAME DESIRED CURRENT READY AGE
hello-minikube-c8b6b4fdc 1 1 1 1m
The command below creates a replica of the original pod, resulting in two total pods.
Nicholass-MBP:localkube nicholasrusso$ kubectl scale \
> deployments/hello-minikube --replicas=2
deployment.extensions "hello-minikube" scaled
Get the pod information to see the new replica up and running. Theoretically, the capacity of this application
has been doubled and can now handle twice the workload (again, assuming load balancing has been set
up and the application operates in such a way where this is useful).
Nicholass-MBP:localkube nicholasrusso$ kubectl get pod
NAME READY STATUS RESTARTS AGE
hello-minikube-c8b6b4fdc-l5jgn 1/1 Running 0 6s
hello-minikube-c8b6b4fdc-nz5nc 1/1 Running 0 1m
The minikube cluster also comes with a GUI dashboard accessible via HTTP, which is explored shortly. First,
the exposed service can be quickly verified from the shell: retrieve its URL using the command below, then
feed the output from this command into curl to issue an HTTP GET request.
Nicholass-MBP:localkube nicholasrusso$ minikube service hello-minikube --url
http://192.168.99.100:31980
SERVER VALUES:
server_version=nginx: 1.10.0 - lua: 10001
HEADERS RECEIVED:
accept=*/*
host=192.168.99.100:31980
user-agent=curl/7.43.0
BODY:
-no body in request-
The screenshot below shows the overview dashboard of Kubernetes, focusing on the number of pods that
are deployed. At present, there is 1 deployment called hello-minikube which has 2 total pods.
We can scale the application further from the GUI by increasing the replicas from 2 to 3. On the far right
of the deployments window, click the three vertical dots, then scale. Enter the number of replicas desired.
The screenshot below shows the prompt window. The screen reminds the user that there are currently 2
pods, but we desire 3 now.
After scaling this application, the dashboard changes to show new pods being added in the diagram that
follows. After a few seconds, the dashboard reflects 3 healthy pods (not shown for brevity). During this
state, the third replica is still being initialized and is not available for workload processing yet.
Scrolling down further in the dashboard, the individual pods and replica sets are listed. This is similar to the
output displayed earlier from the kubectl get pods command.
Checking the CLI again, the new replica set (ending in cxxlg) created from the dashboard appears here.
Nicholass-MBP:localkube nicholasrusso$ kubectl get pods
NAME READY STATUS RESTARTS AGE
hello-minikube-c8b6b4fdc-cxxlg 1/1 Running 0 21s
hello-minikube-c8b6b4fdc-l5jgn 1/1 Running 0 8m
Next, generate specific programmatic credentials for the “terraform” user. The access key ID identifies the
user and account to AWS, and the secret access key acts as a password that should never be shared.
Once the new “terraform” user exists in the proper group with the proper permissions and a valid access
key, run aws configure from the shell. The aws binary can be installed via Python pip, but if you are like
the author and are using an EC2 instance to run the AWS CLI, it comes pre-installed on Amazon Linux.
Simply answer the questions as they appear, and always copy/paste the access and secret keys to avoid
typos. Choose a region near you and use “json” for the output format, which is the most programmatically
appropriate answer.
[ec2-user@devbox ~]$ aws configure
AWS Access Key ID [None]: AKIAJKRONVDHHQ3GJYGA
AWS Secret Access Key [None]: [hidden]
Default region name [None]: us-east-1
Default output format [None]: json
To quickly test whether AWS CLI is set up correctly, use the command below. Be sure to match up the Arn
number and username to what is shown in the screenshots above.
[ec2-user@devbox ~]$ aws sts get-caller-identity
{
"Account": "043535020805",
"UserId": "AIDAINLWE2QY3Q3U6EVF4",
"Arn": "arn:aws:iam::043535020805:user/terraform"
}
The goal of this short demonstration is to deploy a Cisco CSR1000v into the default VPC within the avail-
ability zone us-east-1a. Building out a whole new virtual environment using the AWS CLI manually is not
covered here; the existing default VPC is used instead.
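The query that produced the VPC ID referenced below is not reproduced. A typical command uses the --query option to return only the VPC IDs; the single-entry output is illustrative and assumes the default VPC is the only one present.
[ec2-user@devbox ~]$ aws ec2 describe-vpcs --query 'Vpcs[*].VpcId'
[
    "vpc-889b03ee"
]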
Armed with the VPC ID from above, ask for the subnets available in this VPC. By default, every AZ within
this region has a default subnet, but since this demonstration is focused on us-east-1a, we can apply some
filters. First, we filter subnets only contained in the default VPC, then additionally only on the us-east-1a AZ
subnets. One subnet is returned with SubnetId of subnet-f1dfa694.
[ec2-user@devbox ~]$ aws ec2 describe-subnets --filters \
> 'Name=vpc-id,Values=vpc-889b03ee' 'Name=availability-zone,Values=us-east-1a'
Armed with the proper subnet for the CSR1000v, an Amazon Machine Image (AMI) must be identified
to deploy. Since there are many flavors of CSR1000v available, such as bring your own license (BYOL),
maximum performance, and security, apply a filter to target the specific image desired. The example below
shows a name-based filter searching for a string containing 16.09 as the version followed later by BYOL, the
lowest cost option. Record the ImageId, which is ami-0d1e6af4c329efd82, as this is the image to deploy.
Note: Cisco images require the user to accept the terms of a license agreement before usage. One must
navigate to the following page first, subscribe, and accept the terms prior to attempting to start this instance,
or the launch will fail with an error. Visit this link for details.
[ec2-user@devbox ~]$ aws ec2 describe-images --filters \
> 'Name=name,Values=cisco-CSR-.16.09*BYOL*'
{
"Images": [
{
"ProductCodes": [
{
"ProductCodeId": "5tiyrfb5tasxk9gmnab39b843",
"ProductCodeType": "marketplace"
}
],
"Description": "cisco-CSR-trhardy-20180727122305.16.09.01-BYOL-HVM",
"VirtualizationType": "hvm",
"Hypervisor": "xen",
"ImageOwnerAlias": "aws-marketplace",
"EnaSupport": true,
"SriovNetSupport": "simple",
"ImageId": "ami-0d1e6af4c329efd82",
"State": "available",
"BlockDeviceMappings": [
{
"DeviceName": "/dev/xvda",
"Ebs": {
"Encrypted": false,
"DeleteOnTermination": true,
"VolumeType": "standard",
"VolumeSize": 8,
"SnapshotId": "snap-010a7ddb206eb016e"
}
}
],
"Architecture": "x86_64",
Next, capture the available security groups and choose one. Be sure to filter on the default VPC to avoid
cluttering output with any Ansible VPC related security groups. The default security group, in this case, is
wide open and permits all traffic. The GroupId of sg-4d3a5c31 can be used when deploying the CSR1000v.
[ec2-user@devbox ~]$ aws ec2 describe-security-groups --filter \
> 'Name=vpc-id,Values=vpc-889b03ee'
{
"SecurityGroups": [
{
"IpPermissionsEgress": [
{
"IpProtocol": "-1",
"PrefixListIds": [],
"IpRanges": [
{
"CidrIp": "0.0.0.0/0"
}
],
"UserIdGroupPairs": [],
"Ipv6Ranges": []
}
],
"Description": "default VPC security group",
"IpPermissions": [
{
"IpProtocol": "-1",
"PrefixListIds": [],
"IpRanges": [
{
"CidrIp": "0.0.0.0/0"
}
],
"UserIdGroupPairs": [],
With all the key information collected, use the command below with the appropriate inputs to create the new
EC2 instance. After running the command, a string is returned with the instance ID of the new instance;
this is why the --query argument is handy when deploying new instances using AWS CLI. The CSR1000v
will take a few minutes to fully power up.
[ec2-user@devbox ~]$ aws ec2 run-instances --image-id ami-0d1e6af4c329efd82 \
> --subnet-id subnet-f1dfa694 \
> --security-group-ids sg-4d3a5c31 \
> --count 1 \
> --instance-type t2.medium \
> --key-name EC2-key-pair \
> --query "Instances[0].InstanceId"
"i-08808ba7abf0d2242"
In the meantime, collect information about the instance using the command below. Use the --instance-ids
option to supply a list of strings, each containing a specific instance ID. The value returned above is pasted
below. The status is still “initializing”.
[ec2-user@devbox ~]$ aws ec2 describe-instance-status --instance-ids 'i-08808ba7abf0d2242'
{
"InstanceStatuses": [
{
"InstanceId": "i-08808ba7abf0d2242",
"InstanceState": {
"Code": 16,
"Name": "running"
},
"AvailabilityZone": "us-east-1a",
"SystemStatus": {
"Status": "ok",
"Details": [
{
"Status": "passed",
"Name": "reachability"
}
]
},
"InstanceStatus": {
"Status": "initializing",
"Details": [
{
"Status": "initializing",
"Name": "reachability"
}
]
}
}
]
In order to connect to the instance to configure it, the public IP or public DNS hostname is required. The
command below targets this specific information without a massive JSON dump. Simply feed in the instance
ID. Without the complex query, one could manually scan the JSON to find the address, but this solution is
more targeted and elegant.
[ec2-user@devbox ~]$ aws ec2 describe-instances \
> --instance-ids i-08808ba7abf0d2242 --output text \
> --query 'Reservations[*].Instances[*].PublicIpAddress'
34.201.13.127
Assuming your private key is already present with the proper permissions (read-only for owner), SSH into
the instance using the newly-discovered public IP address. A quick check of the IOS XE version suggests
that the deployment succeeded.
[ec2-user@devbox ~]$ ls -l privkey.pem
-r-------- 1 ec2-user ec2-user 1670 Jan 1 16:54 privkey.pem
Termination is simple as well. The only challenge is that, generally, one would have to rediscover the
instance ID assuming the termination happened long after the instance was created. The alternative is
manually writing some kind of shell script to store that data in a file, which must be manually read back in to
delete the instance. The next section on Terraform helps overcome these state problems in a simple way,
but for now, simply delete the CSR1000v using the command below. The JSON output confirms that the
instance is shutting down.
[ec2-user@devbox ~]$ aws ec2 terminate-instances --instance-ids i-08808ba7abf0d2242
{
"TerminatingInstances": [
{
"InstanceId": "i-08808ba7abf0d2242",
"CurrentState": {
"Code": 32,
"Name": "shutting-down"
},
"PreviousState": {
"Code": 16,
"Name": "running"
This CurrentState of shutting-down will remain for a few minutes until the instance is gone. Running the
command again confirms the instance no longer exists as the state is terminated.
[ec2-user@devbox ~]$ aws ec2 terminate-instances --instance-ids i-08808ba7abf0d2242
{
"TerminatingInstances": [
{
"InstanceId": "i-08808ba7abf0d2242",
"CurrentState": {
"Code": 48,
"Name": "terminated"
},
"PreviousState": {
"Code": 48,
"Name": "terminated"
}
}
]
}
[ec2-user@devbox ~]$ ls -l
-rw-rw-r-- 1 ec2-user ec2-user 20971661 Dec 14 21:21 terraform_0.11.11_linux_amd64.zip
Unzip the package to reveal a single binary. At this point, Terraform operators have 3 options:
1. Move the binary to a directory in your PATH. This is the author’s preferred choice and what is done
below.
2. Add the current directory (where the terraform binary exists) to the shell PATH.
3. Prefix the binary with ./ every time you want to use it.
[ec2-user@devbox ~]$ unzip terraform_0.11.11_linux_amd64.zip
Archive: terraform_0.11.11_linux_amd64.zip
inflating: terraform
Last, the author recommends creating a directory for this particular Terraform project as shown below.
Change into that directory and create a new text file called “network.tf”. Open the file in your favorite editor
to begin creating the Terraform plan.
[ec2-user@devbox ~]$ mkdir tf-demo && cd tf-demo
[ec2-user@devbox tf-demo]$
Next, use the aws_vpc resource to create a new VPC. The documentation suggests that only the cidr_block
argument is required. The author suggests adding a Name tag to help organize resources as well. Note that
there is a large list of “attribute” fields on the documentation page. These are the pieces of data returned
by Terraform, such as the VPC ID and Amazon Resource Name (ARN). These are dynamically allocated at
runtime, and referencing these values can simplify the Terraform plan later.
# Create a new VPC for DMZ services
resource "aws_vpc" "tfvpc" {
  cidr_block = "203.0.113.0/24"
  tags = {
    Name = "tfvpc"
  }
}
Next, use the aws_subnet resource to create a new IP subnet. The documentation indicates that cidr_block
and vpc_id arguments are needed. The former is self-explanatory as it represents a subnet within the VPC
network of 203.0.113.0/24; this demonstration uses 203.0.113.64/26. The VPC ID is returned from the
aws_vpc resource and can be referenced using the ${} syntax shown below. The name tfvpc has an
attribute called id that identifies the VPC in which this new subnet should be created. Like the aws_vpc
resource, aws_subnet also returns an ID which can be referenced later when creating EC2 instances.
# Create subnet within the new VPC for the DMZ
resource "aws_subnet" "dmz" {
  vpc_id            = "${aws_vpc.tfvpc.id}"
  cidr_block        = "203.0.113.64/26"
  availability_zone = "us-east-1a"
  tags = {
    Name = "dmz"
  }
}
Now that the basic network constructs have been configured, it's time to add EC2 instances to build the
DMZ. One could just add a few more resource invocations to the existing network.tf file. For variety, the
author is going to create a second file for the EC2 compute devices. When multiple *.tf configuration files
exist, they are loaded in alphabetical order, but that’s largely irrelevant since Terraform is smart enough to
create/destroy resources in the appropriate sequence regardless of the file names.
Edit a file called “services.tf” in your favorite text editor and apply the following configuration to deploy a
Cisco ASAv and CSR1000v within the us-east-1a AZ. The AMI for the CSR1000v is the same one used in
the AWS CLI demonstration. The AMI for the ASAv is the BYOL version, which was derived using the AWS
CLI describe-instances. Both instances are placed in the newly created subnet within the newly created
VPC, keeping everything separate from any existing AWS resources. Just like with the CSR1000v images,
Cisco requires the user to accept the terms of a license agreement before usage. One must navigate to
the following page first, subscribe, and accept the terms prior to attempting to start this instance, or the
launch will fail with an error. Visit this link for details.
# Cisco ASAv BYOL
resource "aws_instance" "dmz_asav" {
ami = "ami-4fbf3c30"
instance_type = "m4.large"
Once the Terraform plan files have been configured, use terraform init. This scans all the plan files for
any required plugins. In this case, the AWS provider is needed given the types of resource invocations
present. To keep the initial Terraform binary small, individual provider plugins are not included and are
downloaded as-needed. Like most good tools, Terraform is very verbose and provides hints and help along
the way. The output below represents a successful setup.
[ec2-user@devbox tf-demo]$ terraform init
To prevent automatic upgrades to new major versions that may contain breaking
changes, it is recommended to add version = "..." constraints to the
corresponding provider blocks in configuration, with the constraint strings
suggested below.
------------------------------------------------------------------------
+ aws_instance.dmz_csr1000v
id: <computed>
ami: "ami-0d1e6af4c329efd82"
arn: <computed>
associate_public_ip_address: <computed>
availability_zone: <computed>
cpu_core_count: <computed>
cpu_threads_per_core: <computed>
ebs_block_device.#: <computed>
ephemeral_block_device.#: <computed>
get_password_data: "false"
host_id: <computed>
instance_state: <computed>
instance_type: "t2.medium"
ipv6_address_count: <computed>
ipv6_addresses.#: <computed>
key_name: <computed>
network_interface.#: <computed>
network_interface_id: <computed>
password_data: <computed>
placement_group: <computed>
primary_network_interface_id: <computed>
private_dns: <computed>
+ aws_subnet.dmz
id: <computed>
arn: <computed>
assign_ipv6_address_on_creation: "false"
availability_zone: "us-east-1a"
availability_zone_id: <computed>
cidr_block: "203.0.113.64/26"
ipv6_cidr_block: <computed>
ipv6_cidr_block_association_id: <computed>
map_public_ip_on_launch: "false"
owner_id: <computed>
tags.%: "1"
tags.Name: "dmz"
vpc_id: "${aws_vpc.tfvpc.id}"
+ aws_vpc.tfvpc
id: <computed>
arn: <computed>
assign_generated_ipv6_cidr_block: "false"
cidr_block: "203.0.113.0/24"
default_network_acl_id: <computed>
default_route_table_id: <computed>
default_security_group_id: <computed>
dhcp_options_id: <computed>
enable_classiclink: <computed>
enable_classiclink_dns_support: <computed>
enable_dns_hostnames: <computed>
enable_dns_support: "true"
instance_tenancy: "default"
ipv6_association_id: <computed>
ipv6_cidr_block: <computed>
main_route_table_id: <computed>
owner_id: <computed>
tags.%: "1"
tags.Name: "tfvpc"
------------------------------------------------------------------------
Note: You didn't specify an "-out" parameter to save this plan, so Terraform
can't guarantee that exactly these actions will be performed if
"terraform apply" is subsequently run.
Running the command again and specifying an optional output file allows the plan to be saved to disk.
+ aws_instance.dmz_csr1000v
[snip]
+ aws_subnet.dmz
[snip]
+ aws_vpc.tfvpc
[snip]
------------------------------------------------------------------------
Quickly checking the subnet details in the AWS console confirms that the subnet is in the correct VPC and AZ,
and has the right IPv4 CIDR range.
Going back to Terraform, notice that a new terraform.tfstate file has been created. This represents the
new infrastructure state after the Terraform plan was applied. Use terraform show to view the file, which
contains all the computed fields filled in, such as the ARN value.
[ec2-user@devbox tf-demo]$ ls -l
total 28
-rw-rw-r-- 1 ec2-user ec2-user 533 Jan 1 18:54 network.tf
-rw-rw-r-- 1 ec2-user ec2-user 7437 Jan 1 19:00 plan.tfstate
-rw-rw-r-- 1 ec2-user ec2-user 417 Jan 1 18:59 services.tf
-rw-rw-r-- 1 ec2-user ec2-user 10917 Jan 1 19:01 terraform.tfstate
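Because the state file is plain JSON, it can also be inspected programmatically. The short sketch below is
not part of the original demonstration; it assumes the Terraform 0.11.x state layout, where resources live
under a top-level "modules" list.

import json

# Peek inside the Terraform 0.11.x state file; the "modules"/"resources"
# layout is an assumption tied to that state format version.
with open('terraform.tfstate') as handle:
    state = json.load(handle)

for module in state['modules']:
    for name, resource in module['resources'].items():
        print(name, resource['primary']['id'])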
Running terraform plan again provides a diff-like report on what changes need to be made to the in-
frastructure to implement the plan. Since no new changes have been made manually to the environment
(outside of Terraform), no updates are needed.
[ec2-user@devbox tf-demo]$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.
------------------------------------------------------------------------
This means that Terraform did not detect any differences between your
configuration and real physical resources that exist. As a result, no
actions need to be performed.
Suppose a clumsy user accidentally deletes the CSR1000v as shown below. Wait for the instance to be
terminated.
[ec2-user@devbox tf-demo]$ aws ec2 terminate-instances \
Running terraform plan now detects a change and suggests adding 1 more resource to the infrastructure
to make the intended plan a reality. Simply use terraform apply to update the infrastructure and
answer yes to confirm. Note that you cannot simply rerun plan.tfstate because it was created against an
old state (i.e., an old diff between intended and actual states).
[ec2-user@devbox tf-demo]$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.
------------------------------------------------------------------------
+ aws_instance.dmz_csr1000v
id: <computed>
ami: "ami-0d1e6af4c329efd82"
arn: <computed>
[snip]
+ aws_instance.dmz_csr1000v
id: <computed>
ami: "ami-0d1e6af4c329efd82"
arn: <computed>
[snip]
source_dest_check: "true"
subnet_id: "subnet-01461157fed507e7b"
tags.%: "1"
tags.Name: "dmz_csr1000v"
tenancy: <computed>
volume_tags.%: <computed>
vpc_security_group_ids.#: <computed>
aws_instance.dmz_csr1000v: Creating...
ami: "" => "ami-0d1e6af4c329efd82"
arn: "" => "<computed>"
[snip]
source_dest_check: "" => "true"
subnet_id: "" => "subnet-01461157fed507e7b"
tags.%: "" => "1"
tags.Name: "" => "dmz_csr1000v"
tenancy: "" => "<computed>"
volume_tags.%: "" => "<computed>"
vpc_security_group_ids.#: "" => "<computed>"
aws_instance.dmz_csr1000v: Still creating... (10s elapsed)
aws_instance.dmz_csr1000v: Still creating... (20s elapsed)
aws_instance.dmz_csr1000v: Still creating... (30s elapsed)
aws_instance.dmz_csr1000v: Creation complete after 32s (ID: i-05d5bb841cf4e2ad1)
The new instance is currently initializing, and Terraform plan says all is well.
[ec2-user@devbox tf-demo]$ aws ec2 describe-instance-status \
> --instance-ids 'i-05d5bb841cf4e2ad1' \
> --query InstanceStatuses[*].InstanceStatus.Status
[
"initializing"
]
[ec2-user@devbox tf-demo]$ terraform plan
[snip]
No changes. Infrastructure is up-to-date.
To cleanup, use terraform plan -destroy to view a plan to remove all of the resources added by Ter-
raform. This is a great way to ensure no residual AWS resources are left in place (and costing money) long
after they are needed.
[ec2-user@devbox tf-demo]$ terraform plan -destroy
------------------------------------------------------------------------
- aws_instance.dmz_asav
- aws_instance.dmz_csr1000v
- aws_subnet.dmz
- aws_vpc.tfvpc
The command above serves as a good preview into what terraform destroy will perform. Below, the
infrastructure is torn down in the reverse order it was created. Note that -auto-approve can be appended
to both apply and destroy actions to remove the interactive prompt asking for yes.
[ec2-user@devbox tf-demo]$ terraform destroy -auto-approve
aws_vpc.tfvpc: Refreshing state... (ID: vpc-0edde0f2f198451e1)
aws_subnet.dmz: Refreshing state... (ID: subnet-01461157fed507e7b)
aws_instance.dmz_csr1000v: Refreshing state... (ID: i-05d5bb841cf4e2ad1)
aws_instance.dmz_asav: Refreshing state... (ID: i-03ac772e458bb9282)
aws_instance.dmz_csr1000v: Destroying... (ID: i-05d5bb841cf4e2ad1)
aws_instance.dmz_asav: Destroying... (ID: i-03ac772e458bb9282)
aws_instance.dmz_asav: Still destroying... (ID: i-03ac772e458bb9282, 10s elapsed)
aws_instance.dmz_csr1000v: Still destroying... (ID: i-05d5bb841cf4e2ad1, 10s elapsed)
aws_instance.dmz_csr1000v: Still destroying... (ID: i-05d5bb841cf4e2ad1, 20s elapsed)
aws_instance.dmz_asav: Still destroying... (ID: i-03ac772e458bb9282, 20s elapsed)
aws_instance.dmz_asav: Still destroying... (ID: i-03ac772e458bb9282, 30s elapsed)
aws_instance.dmz_csr1000v: Still destroying... (ID: i-05d5bb841cf4e2ad1, 30s elapsed)
aws_instance.dmz_asav: Destruction complete after 40s
aws_instance.dmz_csr1000v: Still destroying... (ID: i-05d5bb841cf4e2ad1, 40s elapsed)
[snip, waiting for CSR1000v to terminate]
aws_instance.dmz_csr1000v: Still destroying... (ID: i-05d5bb841cf4e2ad1, 2m50s elapsed)
aws_instance.dmz_csr1000v: Destruction complete after 2m51s
aws_subnet.dmz: Destroying... (ID: subnet-01461157fed507e7b)
aws_subnet.dmz: Destruction complete after 1s
aws_vpc.tfvpc: Destroying... (ID: vpc-0edde0f2f198451e1)
aws_vpc.tfvpc: Destruction complete after 0s
Using terraform plan -destroy again says there is nothing left to destroy, indicating that everything has
been cleaned up. Further verification via AWS CLI or AWS console may be desirable, but for brevity, the
author excludes it here.
2.1.1 YANG
YANG defines how data is structured/modeled rather than containing data itself. Below is a snippet from
RFC 6020, which defines YANG (section 4.2.2.1). The YANG model defines a “host-name” field as a string
(array of characters) with a human-readable description. Pairing YANG with NETCONF, the XML syntax
references the data field by its name to set a value.
YANG Example:
leaf host-name {
type string;
description "Hostname for this system";
}
NETCONF XML Example:
<host-name>my.example.com</host-name>
This section explores a YANG validation example using Cisco CSR1000v on modern “Everest” software.
This router is running as an EC2 instance inside AWS. Although the NETCONF router is not used until
later, it is important to check the software version to ensure we clone the right YANG models.
NETCONF_TEST#show version | include IOS_Software
Cisco IOS Software [Everest], Virtual XE Software (X86_64_LINUX_IOSD-UNIVERSALK9-M),
Version 16.6.1, RELEASE SOFTWARE (fc2)
YANG models for this particular version are publicly available on Github. Below, the repository is cloned
using SSH which captures all of the YANG models for all supported products, across all versions. The
repository is not particularly large, so cloning the entire thing is beneficial for future testing.
Nicholas-MBP:YANG nicholasrusso$ git clone [email protected]:YangModels/yang.git
Cloning into 'yang'...
remote: Counting objects: 10372, done.
remote: Compressing objects: 100% (241/241), done.
remote: Total 10372 (delta 74), reused 292 (delta 69), pack-reused 10062
Receiving objects: 100% (10372/10372), 19.95 MiB | 4.81 MiB/s, done.
Resolving deltas: 100% (6556/6556), done.
The YANG model itself is a C-style declaration of how data should be structured. The file is very long, and
the text below focuses on a few key EIGRP parameters. Specifically, the bandwidth-percent, hello-interval,
and hold-time. These are configured under the af-interface stanza within EIGRP named-mode. The af-
interface declaration is a list element with many leaf elements beneath it, which correspond to individual
configuration parameters.
Nicholass-MBP:1661 nicholasrusso$ cat Cisco-IOS-XE-eigrp.yang
module Cisco-IOS-XE-eigrp {
namespace "http://cisco.com/ns/yang/Cisco-IOS-XE-eigrp";
prefix ios-eigrp;
import ietf-inet-types {
prefix inet;
}
import Cisco-IOS-XE-types {
prefix ios-types;
}
import Cisco-IOS-XE-interface-common {
prefix ios-ifc;
}
// [snip]
// the lines that follow are under "router eigrp address-family"
grouping eigrp-address-family-grouping {
list af-interface {
description
"Enter Address Family interface configuration";
key "name";
leaf name {
type string;
}
leaf bandwidth-percent {
description
"Set percentage of bandwidth percentage limit";
type uint32 {
range "1..999999";
}
}
leaf hello-interval {
description
"Configures hello interval";
type uint16;
}
leaf hold-time {
description
Before exploring NETCONF, which will use this model to get/set configuration data on the router, this demon-
stration explores the pyang tool. This is a conversion tool to change YANG into different formats. The pyang
tool is available here. After extracting the archive, the tool is easily installed.
Nicholass-MBP:pyang-1.7.3 nicholasrusso$ python3 setup.py install
running install
running bdist_egg
running egg_info
writing top-level names to pyang.egg-info/top_level.txt
writing pyang.egg-info/PKG-INFO
writing dependency_links to pyang.egg-info/dependency_links.txt
[snip]
The most basic usage of the pyang tool is to validate YANG syntax. Beware that running the tool
against a YANG model in a different directory means that pyang uses the local directory (not the one
containing the YANG model) as the search path for any YANG module dependencies. Below, an error
occurs since pyang cannot find the imported modules required by the EIGRP YANG model.
Nicholass-MBP:YANG nicholasrusso$ pyang yang/vendor/cisco/xe/1661/Cisco-IOS-XE-eigrp.yang
yang/vendor/cisco/xe/1661/Cisco-IOS-XE-eigrp.yang:5: error: module
"ietf-inet-types" not found in search path
yang/vendor/cisco/xe/1661/Cisco-IOS-XE-eigrp.yang:10: error: module
"Cisco-IOS-XE-types" not found in search path
[snip]
One could specify the module path using the --path option, but it is simpler to just navigate to the directory.
This allows pyang to see the imported data types such as those contained within ietf-inet-types. When
using pyang from this location, no output is returned, and the program exits successfully. It is usually a good
idea to validate YANG models before doing anything with them, especially committing them to a repository.
Nicholass-MBP:YANG nicholasrusso$ cd yang/vendor/cisco/xe/1661/
Nicholass-MBP:1661 nicholasrusso$ pyang Cisco-IOS-XE-eigrp.yang
Nicholass-MBP:1661 nicholasrusso$ echo $?
0
This confirms that the model has valid syntax. The pyang tool can also convert between different formats.
Below is a simple and lossless conversion of YANG syntax into XML. This YANG-to-XML format is known
as YIN, and pyang can generate pretty XML output based on the YANG model. This is an alternative way
to view, edit, or create data models. The YIN format might be useful for Microsoft PowerShell users, as
PowerShell makes XML parsing easy and may not be as friendly to the YANG syntax.
Nicholass-MBP:1661 nicholasrusso$ pyang Cisco-IOS-XE-eigrp.yang \
> --format=yin --yin-pretty-strings
<?xml version="1.0" encoding="UTF-8"?>
<module name="Cisco-IOS-XE-eigrp"
xmlns="urn:ietf:params:xml:ns:yang:yin:1"
xmlns:ios-eigrp="http://cisco.com/ns/yang/Cisco-IOS-XE-eigrp"
xmlns:inet="urn:ietf:params:xml:ns:yang:ietf-inet-types"
xmlns:ios-types="http://cisco.com/ns/yang/Cisco-IOS-XE-types"
2.1.3 JSON
JavaScript Object Notation (JSON) is another data modeling language that is similar to YAML in concept. It
was designed to be simpler than traditional markup languages and uses key/value pairs to store information.
The “value” of a given pair can be another key/value pair, which enables hierarchical data nesting. The
key/value pair structure and syntax is very similar to the dict data type in Python. Like YAML, JSON is
also commonly used for maintaining configuration files or as a form of structured feedback from a query or
API call. The next page displays a syntax example of JSON which represents the same data and same
structure as the YAML example.
[
    {
        "process": "update static routes",
        "vrf": "customer1",
        "nexthop": "10.0.0.1",
        "devices": [
            {
                "net": "192.168.0.0",
                "mask": "255.255.0.0",
                "state": "present"
            },
            {
                "net": "172.16.0.0",
                "mask": "255.240.0.0",
                "state": "absent"
            }
        ]
    }
]
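Because this structure maps directly onto Python lists and dictionaries, a few lines of Python (a sketch
added here for illustration, using only the standard json module) show how naturally it can be consumed.

import json

# The JSON example above, parsed into native Python lists and dicts
json_text = '''
[
  {
    "process": "update static routes",
    "vrf": "customer1",
    "nexthop": "10.0.0.1",
    "devices": [
      {"net": "192.168.0.0", "mask": "255.255.0.0", "state": "present"},
      {"net": "172.16.0.0", "mask": "255.240.0.0", "state": "absent"}
    ]
  }
]
'''

data = json.loads(json_text)
print(data[0]['vrf'])                  # customer1
print(data[0]['devices'][1]['state'])  # absent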
More discussion around YAML and JSON is warranted since these two formats are very commonly used
today. YAML is considered to be a strict (or proper) superset of JSON. That is, any JSON data can be rep-
resented in YAML, but not vice versa. This is important to know when converting back and forth; converting
JSON to YAML should always succeed, but converting YAML to JSON may fail or yield unintended results.
Below is a straightforward conversion between YAML and JSON without any hidden surprises.
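As a minimal sketch of that round trip (added for illustration, and assuming the PyYAML library is installed),
the conversion in both directions takes only a few lines of Python.

import json
import yaml  # PyYAML: pip install pyyaml

# Small YAML document reusing keys from the earlier examples
yaml_text = """
process: update static routes
vrf: customer1
nexthop: 10.0.0.1
"""

data = yaml.safe_load(yaml_text)   # YAML -> Python dict
print(json.dumps(data, indent=2))  # dict -> JSON
print(yaml.safe_dump(data))        # dict -> YAML again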
Next, observe an example of some potentially unexpected conversion results. While the JSON result is
technically correct, it lacks the shorthand “anchoring” technique available in YAML. The anchor, for exam-
ple, creates information that can be inherited by other dictionaries later. While the information is identical
between these two blocks and has no functional difference, some of these YAML shortcuts are advanta-
geous for encouraging code/data reuse. Another difference is that YAML natively supports comments using
the hash symbol # while JSON does not natively support comments.
---
anchor: &anchor
  name: "Nick"
  age: 31
clone:
  <<: *anchor
...
{
    "anchor": {
        "name": "Nick",
        "age": 31
    },
    "clone": {
        "name": "Nick",
        "age": 31
    }
}
YANG isn’t directly comparable with YAML, JSON, and XML because it solves a different problem. If any
one of these languages solved all of the problems, then the others would not exist. Understanding the
business drivers and the problems to be solved using these tools is the key to choosing the right one.
2.1.4 XML
Data structured in XML is very common and has been popular for decades. XML is very verbose and
explicit, relying on starting and ending tags to identify the size/scope of specific data fields. The next page
shows an example of XML code resembling a similar structure as the previous YAML and JSON examples.
Note that the topmost root wrapper key is needed in XML but not for YAML or JSON.
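As a hedged illustration (not the book's original example), the same data from the YAML and JSON examples
can be rendered as XML with the xmltodict library, which reappears later in the NETCONF demonstration.
The root element name "config" is an arbitrary choice here.

import xmltodict

# Same data as the YAML/JSON examples, wrapped in a single root element
# ("config" is an arbitrary choice) because XML requires one.
data = {
    'config': {
        'process': 'update static routes',
        'vrf': 'customer1',
        'nexthop': '10.0.0.1',
        'devices': {
            'device': [
                {'net': '192.168.0.0', 'mask': '255.255.0.0', 'state': 'present'},
                {'net': '172.16.0.0', 'mask': '255.240.0.0', 'state': 'absent'},
            ]
        },
    }
}

print(xmltodict.unparse(data, pretty=True))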
Next, install the Cisco IOS-XR specific libraries needed to communicate using gRPC. This could be bundled
into the previous step, but was separated in this document for cleanliness.
[root@devbox ec2-user]# pip install iosxr_grpc
Collecting iosxr_grpc
[snip]
Clone this useful gRPC client library, written by Karthik Kumaravel. It contains a number of wrapper func-
tions to simplify using gRPC for both production and learning purposes. Using the ls command, ensure the
ios-xr-grpc-python/ directory has files in it. This indicates a successful clone. More skilled developers
may skip this step and write custom Python code using the iosxr_grpc library directly.
[root@devbox ec2-user]# git clone \
> https://github.com/cisco-grpc-connection-libs/ios-xr-grpc-python.git
Cloning into 'ios-xr-grpc-python'...
remote: Counting objects: 419, done.
remote: Total 419 (delta 0), reused 0 (delta 0), pack-reused 419
Receiving objects: 100% (419/419), 99.68 KiB | 0 bytes/s, done.
Resolving deltas: 100% (219/219), done.
Install the pyang tool, which is a Python utility for managing YANG models. This same tool is used to
validate and convert YANG models in the YANG section of this book.
Before continuing, ensure you have a functional IOS-XR platform running version 6.0 or later. Log into the
IOS-XR platform via SSH and enable gRPC. It’s very simple and only requires identifying a TCP port on
which to listen. Additionally, TLS-based security options are available but omitted for this demonstration.
This IOS-XR platform is an XRv9000 running in AWS on version 6.3.1.
RP/0/RP0/CPU0:XRv_gRPC#show version
Cisco IOS XR Software, Version 6.3.1
Copyright (c) 2013-2017 by Cisco Systems, Inc.
Build Information:
Built By : ahoang
Built On : Wed Sep 13 18:30:01 PDT 2017
Build Host : iox-ucs-028
Workspace : /auto/srcarchive11/production/6.3.1/xrv9k/workspace
Version : 6.3.1
Location : /opt/cisco/XR/packages/
Once enabled, check the gRPC status and statistics, respectively, to ensure it is running. The TCP port is
10033 and TLS is disabled for this test. The statistics do not show any gRPC activity yet. This makes sense
since no API calls have been executed.
RP/0/RP0/CPU0:XRv_gRPC#show grpc status
Manually configure some OSPFv3 parameters via CLI to start. Below is a configuration snippet from the
IOS-XRv platform running gRPC.
RP/0/RP0/CPU0:XRv_gRPC#show running-config router ospfv3
router ospfv3 42518
router-id 10.10.10.2
log adjacency changes detail
area 0
interface Loopback0
passive
!
interface GigabitEthernet0/0/0/0
cost 1000
network point-to-point
hello-interval 1
!
!
address-family ipv6 unicast
Navigate to the examples/ directory inside of the cloned IOS-XR gRPC client utility. The cli.py utility can
Using the popular jq (JSON query) utility, one can pull out the OSPFv3 configuration from the file.
[root@devbox examples]# jq '.data."Cisco-IOS-XR-ipv6-ospfv3-cfg:ospfv3"' json/ospfv3.json
{
"processes": {
"process": [
{
"process-name": 42518,
"default-vrf": {
"router-id": "10.10.10.2",
"log-adjacency-changes": "detail",
"area-addresses": {
"area-area-id": [
{
"area-id": 0,
"enable": [
null
],
"interfaces": {
"interface": [
{
"interface-name": "Loopback0",
"enable": [
null
],
"passive": true
},
{
"interface-name": "GigabitEthernet0/0/0/0",
"enable": [
null
],
"cost": 1000,
"network": "point-to-point",
"hello-interval": 1
}
]
}
}
]
}
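For environments without jq, a similar extraction is possible in a few lines of Python. This sketch is not
part of the original demonstration and assumes the get-config reply was saved to json/ospfv3.json as
shown above.

import json

# Pull the OSPFv3 container out of the saved gRPC get-config reply
with open('json/ospfv3.json') as handle:
    reply = json.load(handle)

ospfv3 = reply['data']['Cisco-IOS-XR-ipv6-ospfv3-cfg:ospfv3']
for process in ospfv3['processes']['process']:
    print(process['process-name'], process['default-vrf']['router-id'])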
Using a text editor, manually update the merge.json file by adding the top-level key of “Cisco-IOS-XR-
ipv6-ospfv3-cfg:ospfv3” and changing some minor parameters. In the example below, the author updates
the Gig0/0/0/0 cost, network type, and hello interval. Don't forget the trailing } at the bottom of the file after
adding the top-level key discussed above, or else the JSON data will be syntactically incorrect.
[root@devbox examples]# cat json/merge.json
{
"Cisco-IOS-XR-ipv6-ospfv3-cfg:ospfv3": {
"processes": {
"process": [
{
"process-name": 42518,
"default-vrf": {
"router-id": "10.10.10.2",
"log-adjacency-changes": "detail",
"area-addresses": {
"area-area-id": [
{
"area-id": 0,
"enable": [
null
],
"interfaces": {
"interface": [
{
"interface-name": "Loopback0",
"enable": [
null
],
"passive": true
},
{
"interface-name": "GigabitEthernet0/0/0/0",
"enable": [
null
],
"cost": 123,
"network": "broadcast",
"hello-interval": 17
}
Use the cli.py utility again except with the merge-config option. Specify the merge.json file as the
configuration delta to merge with the existing configuration. This API call does not return any output, but
checking the return code indicates it succeeded.
[root@devbox examples]# ./cli.py -i xrv_grpc -p 10033 -u root -pw grpctest \
> -r merge-config --file json/merge.json
[root@devbox examples]# echo $?
0
Log into the IOS-XR platform again and confirm via CLI that the configuration was updated.
RP/0/RP0/CPU0:XRv_gRPC#sh run router ospfv3
router ospfv3 42518
router-id 10.10.10.2
log adjacency changes detail
area 0
interface Loopback0
passive
!
interface GigabitEthernet0/0/0/0
cost 123
network broadcast
hello-interval 17
!
!
address-family ipv6 unicast
The gRPC statistics are updated as well. The first get-config request came from the devbox and the
response was sent from the router. The same transactional communication is true for merge-config.
RP/0/RP0/CPU0:XRv_gRPC#show grpc statistics
*************************show gRPC statistics******************
---------------------------------------------------------------
show-cmd-txt-request-recv : 0
show-cmd-txt-response-sent : 0
get-config-request-recv : 1
get-config-response-sent : 1
cli-config-request-recv : 0
cli-config-response-sent : 0
get-oper-request-recv : 0
Below is the code for the demonstration. The comments included in-line help explain what is happening at
a basic level. The file is cisco_paramiko.py.
import time
import paramiko

def get_output(conn):
    """
    Given an open connection, read all the data from the buffer and
    decode the byte string as UTF-8.
    """
    return conn.recv(65535).decode('utf-8')

def main():
    """
    Execution starts here by creating an SSHClient object, assigning login
    parameters, and opening a new shell via SSH.
    """
    conn_params = paramiko.SSHClient()
    conn_params.set_missing_host_key_policy(paramiko.AutoAddPolicy())

    # The connection parameters below are assumptions; they mirror the
    # values used in the netmiko example that follows.
    conn_params.connect(hostname='172.31.31.144', port=22,
                        username='python', password='python',
                        look_for_keys=False, allow_agent=False)
    conn = conn_params.invoke_shell()
    time.sleep(1)
    print(f'Logged into {get_output(conn).strip()} successfully')

if __name__ == '__main__':
    main()
Before running this code, examine the configuration of the router’s services. Notice that DHCP is explicitly
disabled while nagle and sequence-numbers are disabled by default.
CSR1000V#show running-config | include service
service timestamps debug datetime msec
service timestamps log datetime msec
no service dhcp
Run the script using the command below, which logs into the router, gathers some basic information, and
applies some configuration updates.
[ec2-user@devbox ~]$ python3 cisco_paramiko.py
Logged into CSR1000V# successfully
terminal length 0
CSR1000V#
show version
Cisco IOS XE Software, Version 16.09.01
Cisco IOS Software [Fuji], Virtual XE Software (X86_64_LINUX_IOSD-UNIVERSALK9-M),
Version 16.9.1, RELEASE SOFTWARE (fc2)
[version output truncated]
Configuration register is 0x2102
CSR1000V#
show inventory
NAME: "Chassis", DESCR: "Cisco CSR1000V Chassis"
PID: CSR1000V , VID: V00 , SN: 9CZ120O2S1L
CSR1000V#
Below is the code for the demonstration. Like the paramiko example, comments included in-line help explain
the steps. Notice that there is significantly less code, and the code that does exist is relatively simple and ab-
stract. The code accomplishes the same general tasks as the paramiko code. The file is cisco_netmiko.py.
from netmiko import ConnectHandler

def main():
    """
    Execution starts here by creating a new connection with several
    keyword arguments to log into the device.
    """
    conn = ConnectHandler(device_type='cisco_ios', ip='172.31.31.144',
                          username='python', password='python')

    # Run some exec commands and print the output, but don't need
    # to define a custom function to send commands cleanly
    commands = ['terminal length 0', 'show version', 'show inventory']
    for command in commands:
        print(conn.send_command(command))

if __name__ == '__main__':
    main()
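The configuration portion of the script is not shown above. Below is a minimal sketch of how it might look,
using netmiko's send_config_set() helper and assuming the same login parameters; the commands mirror
the three services discussed in the text.

from netmiko import ConnectHandler

# Assumed to match the login parameters from the example above
conn = ConnectHandler(device_type='cisco_ios', ip='172.31.31.144',
                      username='python', password='python')

# Push the three services discussed in the text as one config session
services = ['service dhcp', 'service nagle', 'service sequence-numbers']
print(conn.send_config_set(services))
conn.disconnect()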
For completeness, below is a snippet of the services currently configured. Just like in the paramiko example,
the three services we want to enable (DHCP, nagle, and sequence-numbers) are currently disabled.
Running the code, there is far less output since netmiko cleanly masks the shell prompt from being returned
with each command output, instead only returning the relevant/useful data.
[ec2-user@devbox ~]$ python3 cisco_netmiko.py
Logged into CSR1000V# successfully
After running this code, all three specified services in the services list are automatically configured with
minimal effort. Recall that service dhcp is enabled by default.
CSR1000V#show running-config | include service
service nagle
service timestamps debug datetime msec
service timestamps log datetime msec
service sequence-numbers
RFC 6242 describes NETCONF over SSH, and TCP port 830 has been assigned for this service. A quick
test with the ssh shell command on port 830 shows a successful connection with several lines of XML being
returned. Even without fully parsing this data, the names of several supported YANG modules are visible,
including the EIGRP one of interest.
Nicholass-MBP:ssh nicholasrusso$ ssh -p 830 [email protected]
[email protected]'s password:
<?xml version="1.0" encoding="UTF-8"?>
<hello xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
The netconf-console.py tool is a simple way to interface with network devices that use NETCONF. This is
the same tool used in the Cisco blog post mentioned earlier. Rather than specify basic SSH login information
as command line arguments, the author suggests editing these values in the Python code to avoid typos
while testing. These options begin around line 540 of the netconf-console.py file.
parser.add_option("-u", "--user", dest="username", default="nctest",
help="username")
parser.add_option("-p", "--password", dest="password", default="nctest",
help="password")
parser.add_option("--host", dest="host", default="netconf.njrusmc.net",
help="NETCONF agent hostname")
parser.add_option("--port", dest="port", default=830, type="int",
help="NETCONF agent SSH port")
Run the script using Python 2 (not Python 3, as the code is not syntactically compatible) with the
--hello option to collect the list of supported capabilities from the router. Verify that the EIGRP mod-
ule is present. This output is similar to the native SSH shell test from above except it is handled through the
netconf-console.py tool.
Nicholass-MBP:YANG nicholasrusso$ python netconf-console.py --hello
<?xml version="1.0" encoding="UTF-8"?>
<hello xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
<capabilities>
<capability>urn:ietf:params:netconf:base:1.0</capability>
<capability>urn:ietf:params:netconf:base:1.1</capability>
<capability>urn:ietf:params:netconf:capability:writable-running:1.0</capability>
<capability>urn:ietf:params:netconf:capability:xpath:1.0</capability>
<capability>urn:ietf:params:netconf:capability:validate:1.0</capability>
<capability>urn:ietf:params:netconf:capability:validate:1.1</capability>
<capability>urn:ietf:params:netconf:capability:rollback-on-error:1.0</capability>
<capability>[snip, many capabilities here]</capability>
<capability>http://cisco.com/ns/yang/Cisco-IOS-XE-eigrp?module=Cisco-IOS-XE-eigrp&
revision=2017-02-07</capability>
</capabilities>
<session-id>26801</session-id>
</hello>
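The same capability check can be performed with the ncclient library used later in this section. The sketch
below is an alternative added for illustration; it reuses the host and credentials configured in
netconf-console.py above.

from ncclient import manager

# Connection values mirror the netconf-console.py defaults shown above
with manager.connect(host='netconf.njrusmc.net', port=830, username='nctest',
                     password='nctest', hostkey_verify=False) as conn:
    # Print only the capabilities that mention EIGRP
    for capability in conn.server_capabilities:
        if 'eigrp' in capability.lower():
            print(capability)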
This device claims to support EIGRP configuration via NETCONF as verified above. To simplify the initial
configuration, an EIGRP snippet is provided below which adjusts the variables in scope for this test. These
are CLI commands and are unrelated to NETCONF.
# Applied to NETCONF_TEST router
router eigrp NCTEST
address-family ipv4 unicast autonomous-system 65001
af-interface GigabitEthernet1
bandwidth-percent 9
hello-interval 7
hold-time 8
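To change these values via NETCONF, an edit-config operation is required. The sketch below is an
illustration using ncclient rather than netconf-console.py; its XML payload is modeled on the get-config
reply that follows and sets the values to 19, 17, and 18.

from ncclient import manager

# XML payload modeled on the get-config reply shown below; values are
# bumped to 19/17/18 to prove the change took effect.
EDIT_CONFIG = """
<config xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <native xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-native">
    <router>
      <eigrp xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-eigrp">
        <id>NCTEST</id>
        <address-family>
          <type>ipv4</type>
          <af-ip-list>
            <unicast-multicast>unicast</unicast-multicast>
            <autonomous-system>65001</autonomous-system>
            <af-interface>
              <name>GigabitEthernet1</name>
              <bandwidth-percent>19</bandwidth-percent>
              <hello-interval>17</hello-interval>
              <hold-time>18</hold-time>
            </af-interface>
          </af-ip-list>
        </address-family>
      </eigrp>
    </router>
  </native>
</config>
"""

# Login values mirror the netconf-console.py defaults shown earlier
with manager.connect(host='netconf.njrusmc.net', port=830, username='nctest',
                     password='nctest', hostkey_verify=False) as conn:
    reply = conn.edit_config(target='running', config=EDIT_CONFIG)
    print(reply.ok)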
Perform the get operation once more to ensure the values were updated correctly by NETCONF.
Nicholass-MBP:YANG nicholasrusso$ python netconf-console.py \
> --get-config -x "native/router/eigrp"
<?xml version="1.0" encoding="UTF-8"?>
<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="1">
<data>
<native xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-native">
<router>
<eigrp xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-eigrp">
<id>NCTEST</id>
<address-family>
<type>ipv4</type>
<af-ip-list>
<unicast-multicast>unicast</unicast-multicast>
<autonomous-system>65001</autonomous-system>
<af-interface>
<name>GigabitEthernet1</name>
<bandwidth-percent>19</bandwidth-percent>
<hello-interval>17</hello-interval>
<hold-time>18</hold-time>
</af-interface>
</af-ip-list>
</address-family>
</eigrp>
</router>
</native>
</data>
</rpc-reply>
Logging into the router’s shell via SSH as a final check, the configuration changes made by NETCONF were
retained. Additionally, a syslog message suggests that the configuration was updated by NETCONF, which
helps differentiate it from regular CLI changes.
%DMI-5-CONFIG_I: F0: nesd: Configured from NETCONF/RESTCONF by nctest, transaction-id 81647
from ncclient import manager
import xmltodict

def get_config(connection_params):
    # Open connection using the parameter dictionary
    with manager.connect(**connection_params) as connection:
        config_xml = connection.get_config(source='running').data_xml
        config = xmltodict.parse(config_xml)['data']
    return config

def main():
    # Login information for the router
    connection_params = {
        'host': '172.31.55.203',
        'username': 'cisco',
        'password': 'cisco',
        'hostkey_verify': False,
    }

    # Collect the running config and print a few values from the IOS-XE
    # "native" model; the exact fields printed here are an assumption
    # based on the sample output shown below.
    config = get_config(connection_params)
    print(f"SW version: {config['native']['version']}")
    print(f"Hostname: {config['native']['hostname']}")
    print(f"top level keys: {list(config['native'])}")

if __name__ == '__main__':
    main()
The file below is a jinja2 template file. Jinja2 is a text templating language commonly used with Python
applications and their derivative products, such as Ansible. It contains basic programming logic such as
conditionals, iteration, and variable substitution. By substituting variables into an XML template, the output
is a data structure that NETCONF can push to the devices. The variable fields have been highlighted to
show the relevant logic.
<config>
  <native xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-native">
    <interface>
      {% for loopback in loopbacks %}
      <Loopback>
        <name>{{ loopback.number }}</name>
        {% if loopback.description is defined %}
        <description>{{ loopback.description }}</description>
        {% endif %}
        {% if loopback.ipv4_address is defined %}
        <ip>
          <address>
            <primary>
              <address>{{ loopback.ipv4_address }}</address>
              <mask>{{ loopback.ipv4_mask }}</mask>
            </primary>
          </address>
        </ip>
        {% endif %}
      </Loopback>
      {% endfor %}
    </interface>
  </native>
</config>
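To show how the template and data come together, here is a brief sketch of rendering it with the jinja2
library. The template filename and the IPv4 address/mask on Loopback 53592 are placeholders, since those
specifics are not given in the text.

from jinja2 import Template

# Loopback data modeled on the verification output later in the text;
# the address/mask on Loopback 53592 are placeholder values.
loopbacks = [
    {'number': 42518, 'description': 'No IP on this one yet!'},
    {'number': 53592, 'ipv4_address': '192.0.2.1', 'ipv4_mask': '255.255.255.255'},
]

# The template filename is an assumption for this sketch
with open('nc_loopbacks.j2') as handle:
    template = Template(handle.read())

# The rendered XML is what gets pushed to the router via edit-config
print(template.render(loopbacks=loopbacks))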
Before running the code, verify that netconf-yang is configured as explained during the NETCONF console
demonstration, along with a privilege 15 user. The code above reveals that the demo username/password
is cisco/cisco. After running the code, the output below is printed to standard output. The author has
included the “top level keys” just to show a few other high level options available. Collecting information via
NETCONF is far superior to CLI-based screen scraping via regular expressions for text parsing.
[ec2-user@devbox]$ python3 pynetconf.py
SW version: 16.9
Hostname: CSR1000v
top level keys: ['@xmlns', 'version', 'boot-start-marker', 'boot-end-marker',
'service', 'platform', 'hostname', 'username', 'vrf', 'ip', 'interface',
'control-plane', 'logging', 'multilink', 'redundancy', 'spanning-tree',
Using basic show commands, verify that the two loopbacks were added successfully. The nested dictionary
above indicates that Loopback 42518 has a description defined but no IP addresses. Likewise, Loopback
53592 has an IPv4 address and subnet mask defined, but no description. The Jinja2 template supplied,
which generates the XML configuration to be pushed to the router, makes both of these parameters optional.
CSR1000v#show running-config interface Loopback42518
interface Loopback42518
description No IP on this one yet!
no ip address
Last, check the statistics to see the incoming NETCONF sessions and corresponding incoming remote
procedure calls (RPCs). This indicates that everything is working correctly.
CSR1000v#show netconf-yang statistics
netconf-start-time : 2018-12-09T01:04:44+00:00
in-rpcs : 8
in-bad-rpcs : 0
out-rpc-errors : 0
out-notifications : 0
in-sessions : 4
dropped-sessions : 0
in-bad-hellos : 0
First, the basic configuration to enable the REST API feature on IOS XE devices is shown below. A brief
verification shows that the feature is enabled and uses TCP port 55443 by default. This port number is
important later as the curl command will need to know it.
virtual-service csr_mgmt
ip shared host-interface GigabitEthernet1
activate
ip http secure-server
transport-map type persistent webui HTTPS_WEBUI
secure-server
transport type persistent webui input HTTPS_WEBUI
remote-management
restful-api
Using curl for IOS XE REST API invocations requires a number of options. Those options are summa-
rized next. They are also described in the manual pages for curl (use the man curl shell command).
This specific demonstration will be limited to obtaining an authentication token, posting a QoS class-map
configuration, and verifying that it was written.
Main argument: /api/v1/qos/class-map
-v: Verbose. Prints all debugging output, which is useful for troubleshooting and learning.
-H: Extra header needed to specify that JSON is being used. Every new POST
    request must contain JSON in the body of the request. It is also used with
    GET, POST, PUT, and DELETE requests after an authentication token has been obtained.
-3: Force curl to use SSLv3 for the transport to the managed device. This can
    be detrimental and should be used cautiously (discussed later).
The first step is obtaining an authentication token. This allows the HTTPS client to supply authentication
credentials, such as a username/password, only once, and then use the token for authentication on all future
API calls. The initial attempt at obtaining this token fails. This is a common error, so the troubleshooting used
to resolve it is described in this document. The two HTTPS endpoints cannot communicate because they do
not support any common cipher suites. Note that it is critical to specify the REST API port number (55443) in
the URL; otherwise, the standard HTTPS server will respond on port 443 and the request will fail.
[root@ip-10-125-0-100 restapi]# curl -v \
The utility sslscan can help find the problem. The issue is that the CSR1000v only supports the TLSv1
versions of the ciphers, not the SSLv3 version. The curl command issued above forced curl to use SSLv3
with the -3 option as prescribed by the documentation. This is a minor error in the documentation which
has been reported and may be fixed at the time of your reading. This troubleshooting excursion is likely to
have value for those learning about REST APIs on IOS XE devices in a general sense, since establishing
HTTPS transport is a prerequisite.
[root@ip-10-125-0-100 ansible]# sslscan --version
sslscan version 1.10.2
OpenSSL 1.0.1e-fips 11 Feb 2013
Removing the -3 option will fix the issue. Using sslscan was still useful because, ignoring the RC4 cipher it-
self used with grep, one can note that the TLSv1 variant was accepted while the SSLv3 variant was rejected,
which would suggest a lack of support for SSLv3 ciphers. It appears that the TLS_DHE_RSA_WITH_AES_256_CBC_SHA
cipher was chosen for the connection when the curl command is issued again. Below is the correct output
from a successful curl.
[root@ip-10-125-0-100 restapi]# curl -v -X \
> POST https://csr1:55443/api/v1/auth/token-services \
> -H "Accept:application/json" -u "ansible:ansible" -d "" -k
The final step is using an HTTPS POST request to write new data to the router. One can embed the JSON
text as a single line into the curl command using the -d option. The command appears intimidating at a
glance. Note the single quotes (') surrounding the JSON data with the -d option; these are required since
the keys and values inside the JSON structure have “double quotes”. Additionally, the username/password
is omitted from the request, and additional headers (-H) are applied to include the authentication token
string and the JSON content type.
[root@ip-10-125-0-100 restapi]# curl -v -H "Accept:application/json" \
> -H "X-Auth-Token: YGSBUtzTpfK2QumIEk8dt9rXhHjZfAJSZXYXDXg162Q=" \
> -H "content-type: application/json" -X POST https://csr1:55443/api/v1/qos/class-map \
> -d '{"cmap-name": "CMAP_AF11","description": "QOS CLASS MAP FROM REST API CALL", \
> "match-criteria": {"dscp": [{"value": "af11","ip": false}]}}' -k
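For readers who prefer Python over curl, a comparable call could be made with the requests library. This
sketch is not part of the original demonstration; the token value is the one obtained earlier, and verify=False
mirrors curl's -k option.

import requests

# Token and URL reuse the values from the curl examples above
url = 'https://csr1:55443/api/v1/qos/class-map'
headers = {
    'Accept': 'application/json',
    'Content-Type': 'application/json',
    'X-Auth-Token': 'YGSBUtzTpfK2QumIEk8dt9rXhHjZfAJSZXYXDXg162Q=',
}
body = {
    'cmap-name': 'CMAP_AF11',
    'description': 'QOS CLASS MAP FROM REST API CALL',
    'match-criteria': {'dscp': [{'value': 'af11', 'ip': False}]},
}
response = requests.post(url, headers=headers, json=body, verify=False)
print(response.status_code)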
This newly-configured class-map can be verified using an HTTPS GET request. The data field is stripped to
the empty string, POST is changed to GET, and the class-map name is appended to the URL. The verbose
option (-v) is omitted for brevity. Writing this output to a file and using the jq utility can be a good way to
parse and verify the returned JSON.
Logging into the router to verify the request via CLI is a good idea while learning, although using HTTPS
GET verified the same thing.
RTR_CSR1#show running-config class-map
[snip]
class-map match-all CMAP_AF11
description QOS CLASS MAP FROM REST API CALL
match dscp af11
end
Enabling RESTCONF requires a single hidden command in global configuration, shown below as simply
restconf. This feature is not TAC supported at the time of this writing and should be used for experimenta-
tion only. Additionally, a loopback interface with an IP address and description is configured. For simplicity,
RESTCONF testing will be limited to insecure HTTP to demonstrate the capability without dealing with
SSL/TLS ciphers.
DENALI#show running-config | include restconf
restconf
This section does not detail other HTTP operations such as POST, PUT, and DELETE using RESTCONF.
The feature is still very new and is tightly integrated with Postman, a tool that generates HTTP requests
automatically.
2. Augmented: This model relies on a fully distributed control-plane by adding a centralized controller
that can apply policy to parts of the network at will. Such a controller could inject shorter-match IP
prefixes, policy-based routing (PBR), security features (ACL), or other policy objects. This model is a
good compromise: intelligence stays distributed between nodes to avoid the single point of failure that a
controller introduces, while a known-good distributed control-plane continues to run underneath. The
policy injection only happens when it “needs to”, such as offloading traffic from an overloaded link in
a DC fabric or traffic from a long-haul fiber link between two points of presence (POPs) in an SP core.
Cisco’s Performance Routing (PfR) is an example of the augmented model which uses the Master
Controller (MC) to push policy onto remote forwarding nodes. Another example includes offline path
computation element (PCE) servers for automated MPLS TE tunnel creation. In both cases, a small
set of routers (PfR border routers or TE tunnel head-ends) are modified, yet the remaining routers are
untouched. This model has a lower impact on the existing network because the wholesale failure of
the controller simply returns the network to the distributed model, which is a viable solution for many
business cases. The diagram that follows depicts the augmented SDN model.
4. Centralized: This is the model most commonly referenced when the phrase “SDN” is used. It relies
on a single controller, which hosts the entire control-plane. Ultimately, this device commands all of
the devices in the forwarding-plane. The controller pushes the proper forwarding information (which doesn't
necessarily have to be an IP-based table; it could be anything) down to the forwarding hardware as specified
by the administrators. This offers very granular control, in many
cases, of individual flows in the network. The hardware forwarders can be commoditized into white
boxes (or branded white boxes, sometimes called brite boxes) which are often inexpensive. Another
value proposition of centralizing the control-plane is that a “device” can be almost anything: router,
switch, firewall, load-balancer, etc. Emulating software functions on generic hardware platforms can
add flexibility to the business.
The most significant drawback is the newly-introduced single point of failure and the inability to cre-
ate failure domains as a result. Some SDN scaling architectures suggest simply adding additional
controllers for fault tolerance or to create a hierarchy of controllers for larger networks. While this is
a valid technique, it somewhat invalidates the “centralized” model because with multiple controllers,
the distributed control-plane is reborn. The controllers still must synchronize their routing information
using some network-based protocol and the possibility of inconsistencies between the controllers is
real. When using this multi-controller architecture, the network designer must understand that there
is, in fact, a distributed control-plane in the network; it has just been moved around. The failure of
all controllers means the entire failure domain supported by those controllers will be inoperable. The
failure of the communication paths between controllers could likewise cause inconsistent/intermittent
problems with forwarding, just like a fully distributed control-plane. OpenFlow is a good example of
a fully-centralized model. Nodes colored gray in the diagram that follows have no standalone control-plane.
There are many trade-offs between the different SDN models. The table that follows attempts to capture
the most important ones. Looking at the SDN market at the time of this writing, many solutions seem to be
either hybrid or augmented models. SD-WAN solutions, such as Cisco Viptela, only make changes at the
edge of the network and use overlays/tunnels as the primary mechanism to implement policy.
Ansible playbooks are collections of plays. Each play targets a specific set of hosts and contains a list of
tasks. In YAML, arrays/lists are denoted with a hyphen (-) character. The first play in the playbook begins
with a hyphen since it’s the first element in the array of plays. The play has a name, target hosts, and some
other minor options. Gathering facts can provide basic information like time and date, which are used in
this script. When connection: local is used, the Python code behind Ansible's modules is executed on the
control machine (Linux) and not on the target. This is required for many Cisco devices that are managed
via the CLI.
The first task defines a credentials dictionary. This contains transport information like SSH port (default is
22), target host, username, and password. The ios_config and ios_command modules, for example, re-
quire this to log into the device. The second task uses the ios_config module to issue specific commands.
The commands will specify the SNMPv3 user/group and update the auth/priv passwords for that user. For
accountability reasons, a timestamp is written to the configuration as well using the “facts” gathered earlier
in the play. Minor options to the ios_config module, such as save_when: always and match: none are
optional. The first option saves the configuration after the commands are issued while the second does not
care about what the router already has in its configuration. The commands in the task will forcibly overwrite
whatever is already configured; this is not typically done in production, but is done to illustrate a simple ex-
ample. The changed_when: false option tells Ansible to always report a status of ok rather than changed
which makes the script “succeed” from an operations perspective. The > operator is used in YAML to denote
folded text for readability, and the input is assumed to always be a string. This particular example is not
idempotent. Idempotent is a term used to describe the behavior of only making the necessary changes.
This implies that when no changes need to be made, the tool does nothing. Although considered a best
practice, achieving idempotence is not a prerequisite for creating effective Ansible playbooks.
[ec2-user@devbox ansible]# cat snmp.yml
- name: "IOS >> Issue commands to update SNMPv3 passwords, save config"
  ios_config:
    provider: "{{ CREDENTIALS }}"
    commands:
      - >
        snmp-server user {{ snmp.user }} {{ snmp.group }} v3 auth
        sha {{ snmp.authpass }} priv aes 256 {{ snmp.privpass }}
      - >
        snmp-server contact PASSWORDS UPDATED
        {{ ansible_date_time.date }} at {{ ansible_date_time.time }}
    save_when: always
    match: none
    changed_when: false
...
The playbook above makes a number of assumptions that have not been reconciled yet. First, one should
verify that csr1 is defined and reachable. It is configured as a static hostname-to-IP mapping in the system
hosts file. Additionally, it is defined in the Ansible hosts file as a valid host. Last, it is valuable to ping the
host to ensure that it is powered on and responding over the network. The verification for all aforementioned
steps is below.
[ec2-user@devbox ansible]# grep csr1 /etc/hosts
10.125.1.11 csr1
The final step is to execute the playbook. Verbose output is enabled (-v) so that the generated commands are
shown in the output below, which normally does not happen. Note that the variable substitution, as well
as the Ansible timestamp, appears to be working. The play contained three tasks, all of which succeed.
Although gather_facts didn’t look like a task in the playbook, behind the scenes the setup module was
executed on the control machine, which counts as a task.
[ec2-user@devbox ansible]# ansible-playbook snmp.yml -v
Using /etc/ansible/ansible.cfg as config file
TASK [IOS >> Issue commands to update SNMPv3 passwords, save config] ********
ok: [csr1] =>
{
"banners": {}, "changed": false, "commands":
[
"snmp-server user USERV3 GROUPV3 v3 auth sha ABC123 priv aes 256 DEF456",
"snmp-server contact PASSWORDS UPDATED 2017-05-07 at 18:05:27"
],
"updates":
[
"snmp-server user USERV3 GROUPV3 v3 auth sha ABC123 priv aes 256 DEF456",
"snmp-server contact PASSWORDS UPDATED 2017-05-07 at 18:05:27"
]
}
To verify that the configuration was successfully applied, log into the target router to manually verify the
configuration. To confirm that the configuration was saved, check the startup-configuration manually as
well. The verification is shown below.
RTR_CSR1#show snmp contact
PASSWORDS UPDATED 2017-05-07 at 18:05:27
This simple example only scratches the surface of Ansible. The author has written a comprehensive OSPF
troubleshooting playbook which is simple to set up, executes quickly, and is 100% free. The link to the
Github repository where this playbook is hosted is provided below, and in the references section. There are
many other, unrelated Ansible playbooks available at the author’s Github page as well.
Nick’s OSPF TroubleShooter (nots) — https://github.com/nickrusso42518/nots
First, check the router configuration. There are no VRFs on the device at all. Be sure netconf-yang is
enabled, not netconf, in order for this technology to work correctly.
CSR1#show vrf
[no output]
CSR1#show running-config | include netconf-yang
netconf-yang
Run the playbook once more and the task reports ok, implying that there were no necessary changes since
the state did not change.
[centos@devbox netconf]# ansible-playbook nc_update.yml
Next, let's retrieve the current VRF configuration using NETCONF. This is how the author obtained the initial XML
snippet used to build the jinja2 template above. Another approach could be converting the native YANG model
to XML using pyang or something like it. This playbook is a little more involved since there are some post-
processing steps needed to beautify the XML for human readability and write it to disk. Using the filter
Look at the contents of the file to see how the pieces fit together. Also notice how the proper route-target
state is in place. The ns numbering refers to XML namespaces which, like programming namespaces,
provide uniqueness when same-named constructs are referenced from a single program. The
namespaces should not be included when building XML templates, though. Administrators can also use these
NETCONF captures as a form of configuration state backup.
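For readers who prefer to work outside of Ansible, the same retrieval can be sketched in a few lines of Python
using the ncclient library. The device address, credentials, and subtree filter below are illustrative assumptions
only; the filter simply scopes the reply to the IOS-XE native vrf container described above.

from ncclient import manager

# Illustrative device details; substitute your own host and credentials.
VRF_FILTER = (
    "subtree",
    "<native xmlns='http://cisco.com/ns/yang/Cisco-IOS-XE-native'><vrf/></native>",
)

with manager.connect(
    host="192.0.2.1",
    port=830,
    username="ansible",
    password="ansible",
    hostkey_verify=False,
) as conn:
    # Retrieve only the VRF portion of the running configuration.
    reply = conn.get_config(source="running", filter=VRF_FILTER)
    print(reply.xml)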
Next, examine the playbook. Like NETCONF, there is only one task to perform the update. This module
requires quite a bit more data, including login information given connection: local at the play level. The
other fields help construct the correct HTTP headers needed to configure the device via RESTCONF. There
are no jinja2 templates required at all.
---
# rc_update.yml
- name: "Infrastructure-as-code using RESTCONF"
hosts: routers
connection: local
tasks:
- name: "Update VRF config with HTTP PUT"
uri:
# YAML folded syntax won't work here, shown for readability only
url: >-
https://{{ ansible_host }}/restconf/data/
Cisco-IOS-XE-native:native/Cisco-IOS-XE-native:vrf
user: "ansible"
password: "ansible"
method: PUT
headers:
Content-Type: "application/yang-data+json"
Accept: "application/yang-data+json, application/yang-data.errors+json"
body_format: json
body: "{{ vrfs }}"
validate_certs: false
return_content: true
status_code:
- 200 # OK
- 204 # NO CONTENT
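For comparison, the same RESTCONF PUT can be issued with a few lines of Python using the requests library.
The sketch below is illustrative only; the device address and the minimal vrfs payload are assumptions that
mirror the playbook above.

import json
import requests

# Illustrative device address and credentials mirroring the playbook above.
HOST = "192.0.2.1"
URL = (
    f"https://{HOST}/restconf/data/"
    "Cisco-IOS-XE-native:native/Cisco-IOS-XE-native:vrf"
)
HEADERS = {
    "Content-Type": "application/yang-data+json",
    "Accept": "application/yang-data+json",
}

# Minimal example payload in the same YANG-modeled JSON format.
vrfs = {"Cisco-IOS-XE-native:vrf": {"definition": [{"name": "VPN1", "rd": "1:1"}]}}

resp = requests.put(
    URL,
    auth=("ansible", "ansible"),
    headers=HEADERS,
    data=json.dumps(vrfs),
    verify=False,  # lab only; validate certificates in production
)
print(resp.status_code)  # expect 200 (OK) or 204 (No Content)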
The device has no VRFs on it, just like before. RESTCONF will add them. Be sure restconf is enabled!
Run the playbook, and notice that the task reports ok. Like NETCONF, RESTCONF is idempotent and
easy to program using Ansible. Unlike NETCONF, there is no notification in the HTTP response message
that indicates whether a change was made or not. This could be problematic if there are handlers requiring
notification, but it is often not a big issue. Administrators can see whether changes were made using an HTTP
GET operation, which is coming up next. It is possible that Cisco will update their RESTCONF API to include
this in the future.
[centos@devbox restconf]# ansible-playbook rc_update.yml
In case operators don’t know what the correct data structure looks like, use the uri module again for the
HTTP GET operation. The playbook below allows operators to execute an HTTP GET, collect data, and
write it to a file. It doesn’t require quite as much post-processing as XML since Ansible can beautify JSON
rather easily.
---
# rc_get.yml
- name: "Collect VRF config with RESTCONF"
hosts: routers
connection: local
Quickly run the playbook to gather the current VRF state and store it as a JSON file.
[centos@devbox restconf]# ansible-playbook rc_get.yml
Check the contents of the file to see the JSON returned from RESTCONF. Operators can use this as their
variables template starting point. Simply modify this JSON structure, optionally converting to YAML first if
that is easier, and pass the result into Ansible to manage your infrastructure as code using JSON instead
of CLI commands.
[centos@devbox restconf]# cat vrf_configs/csr1_restconf.json
{
"Cisco-IOS-XE-native:vrf": {
"definition": [
{
"address-family": {
"ipv4": {},
"ipv6": {}
},
"description": "FIRST VRF",
"name": "VPN1",
"rd": "1:1",
Nornir is composed of several main components. First, an optional configuration file is used to specify
global parameters, typically default settings for the execution of Nornir runbooks, which can simplify Nornir
coding later. The same concept exists in Ansible. Exploring the configuration file is not terribly important to
understanding Nornir basics and is not covered in this demonstration.
Also like Ansible, Nornir supports robust options for managing inventory, which is a collection of hosts and
groups. Nornir can even consume existing Ansible inventories for those looking to migrate from Ansible to
Nornir. The inventory file is called hosts.yaml and is required when using Nornir’s default inventory plugin.
The groups file is called groups.yaml and is optional, though often used. Many more advanced inventory
options exist, but this demonstration uses the “simple” inventory method, which is the default.
The simplest possible hosts.yaml file is shown below. There are many other minor options for host fields,
such as a site identifier, role, and group list. This demonstration uses only a single CSR1000v, named as
such in the inventory as a top-level key. The variables specific to this host are the subkeys listed under it.
---
# hosts.yaml
csr1000v:
hostname: "csr1000v.lab.local" # or IP address
username: "cisco"
password: "cisco"
platform: "ios"
For the sake of a more interesting example, consider the case of multiple CSR1000v routers with the same
login information. Copy/pasting host-level variables such as usernames and passwords is undesirable,
especially at scale, so using group-level variables via groups.yaml is a better design. Each CSR is assigned
to group csr which contains the common login information as group-level variables. While the format differs
from Ansible’s YAML inventory, the general logic of data inheritance is the same. More generic variable
definitions, such as group variables, can be overridden on a per-host basis if necessary.
---
# hosts.yaml
csr1000v_1:
hostname: "172.16.1.1"
groups: ["csr"]
csr1000v_2:
hostname: "172.16.1.2"
groups: ["csr"]
csr1000v_3:
hostname: "172.16.1.3"
groups: ["csr"]
csr1000v_4:
  hostname: "172.16.1.4"
  groups: ["csr"]
---
# groups.yaml
csr:
username: "cisco"
password: "cisco"
platform: "ios"
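As a quick, illustrative check that the inventory files are being consumed as expected, a few lines of Python
are enough. This sketch assumes Nornir 2.x and that hosts.yaml and groups.yaml sit in the working directory
(the defaults for the simple inventory plugin).

from nornir import InitNornir

# With the defaults described above, InitNornir() reads hosts.yaml and
# groups.yaml from the working directory.
nr = InitNornir()

# Host-level values win; anything not set on the host (username, password,
# platform) is inherited from the "csr" group defined in groups.yaml.
for name, host in nr.inventory.hosts.items():
    print(name, host.hostname, host.username, host.platform)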
The demonstration below is a simple runbook from Patrick Ogenstad, one of the Nornir developers. The
author has adapted it slightly to fit this book’s format and added comments to briefly explain each step. The
Python file below is named get_facts_ios.py.
from nornir import InitNornir
from nornir.plugins.tasks.networking import napalm_get
from nornir.plugins.functions.text import print_result
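The body of the runbook is not reproduced above. A minimal body, building on those imports and assuming
Nornir 2.x defaults (the simple inventory files described earlier), could look like the sketch below; the facts
variable name matches the object explored with pdb later in this section.

# Initialize Nornir from the inventory files in the working directory, run
# the NAPALM "get_facts" getter against every host, and pretty-print the
# aggregated results.
nr = InitNornir()
facts = nr.run(task=napalm_get, getters=["get_facts"])
print_result(facts)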
Running this code yields the following output. Like Ansible, individual tasks are printed in easy-to-delineate
stanzas which contain the specific output from each task. Here, the data returned by the device is printed,
along with many of the dictionary keys needed to access individual fields, if necessary. This simple method
is great for troubleshooting, but often, programmers will have to perform specific actions on specific
pieces of data.
[ec2-user@devbox nornir-test]# python3 get_facts_ios.py
napalm_get**************************************************************
* csr1000v ** changed : False ******************************************
vvvv napalm_get ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
{ 'get_facts': { 'fqdn': 'CSR1000v.ec2.internal',
'hostname': 'CSR1000v',
'interface_list': ['GigabitEthernet1', 'VirtualPortGroup0'],
'model': 'CSR1000V',
'os_version': 'Virtual XE Software '
'(X86_64_LINUX_IOSD-UNIVERSALK9-M), Version '
'16.9.1, RELEASE SOFTWARE (fc2)',
'serial_number': '9RJTDVAF3DP',
'uptime': 5160,
'vendor': 'Cisco'}}
^^^^ END napalm_get ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The result object is a key component in Nornir, albeit a complex one. The general structure is as follows,
shown in pseudo-YAML format with some minor, intentional technical inaccuracies. This quick visual
representation can help those new to Nornir understand the general structure of data returned by a Nornir run.
result_from_nornir:
More accurately, result_from_nornir is not a pure dictionary but a dict-like object called AggregatedResult,
which combines all of the results across all hosts. Each host is referenced by hostname as a dictionary key,
which returns a MultiResult object. This is a list-like structure which can be indexed by integer, iterated
over, sliced, etc. The elements of these lists are Result objects, which contain the interesting data
accessible from a given task. This data is wrapped in a dictionary that is accessible
through the result attribute of the object, NOT indexable as a dictionary key. The pseudo-YAML below is
slightly more accurate in showing the object structure used for Nornir results.
AggregatedResult:
MultiResult:
- Result:
changed: !!bool
failed: !!bool
name: !!str
result:
specific_field1: ...
specific_field2: ...
- Result: ...
MultiResult:
- Result: ...
- Result: ...
If this seems tricky, it is, and the demonstration below helps explain it. Without digging into the source code
of these custom objects, one can use the Python debugger (pdb) to do some basic discovery. This under-
standing makes programmatically accessing individual fields easier, which Nornir automatically parses and
stores as structured data. Simply add this line of code to the end of the Python script above. This is the
programming equivalent of setting a breakpoint; Python calls them traces.
import pdb; pdb.set_trace()
After running the code and seeing the pretty JSON output displayed, a (Pdb) prompt waits for user input.
Mastering pdb is outside the scope of this book and we will not be exploring pdb-specific commands in any
depth. What pdb enables is a real-time Python command line environment, allowing us to inject arbitrary
code at the trace. Just type facts to start, the name of the object returned by the Nornir run. This alone
reveals a fair amount of information.
(Pdb) facts
AggregatedResult (napalm_get): {'csr1000v': MultiResult: [Result: "napalm_get"]}
First, the facts object is an AggregatedResult, a dict-like object as annotated by the curly braces with
key:value mappings inside. It has one key called csr1000v, the name of our test host. The value of this
key is a MultiResult object which is a list-like structure as annotated by the [square brackets]. Thus,
pdb should indicate that facts['csr1000v'] returns a MultiResult object, which contains a Result object
named napalm_get.
(Pdb) facts['csr1000v']
MultiResult: [Result: "napalm_get"]
Since there was only 1 task that Nornir ran (getting the IOS facts), the length of this list-like object should
be 1. Quickly test that using the Python len() function.
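Continuing the same pdb session, that check might look like the exchange below, with the expected result of 1.

(Pdb) len(facts['csr1000v'])
1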
The Result object has some metadata fields, such as changed and failed (much like Ansible) to indicate
what happened when a task was executed. The real meat of the results is buried in a field called result.
Using Python’s dir() function to explore these fields is useful, as shown below. For brevity, the author has
manually removed some fields not relevant to this discovery exercise.
(Pdb) dir(facts['csr1000v'][0])
[..., 'changed', 'diff', 'exception', 'failed', 'host', 'name', 'result', 'severity_level']
Feel free to casually explore some of these fields by simply referencing them. For example, since this was
a read-only task that succeeded, both changed and failed fields should be false. If this were a task with
configuration changes, changed could potentially be true if actual changes were necessary. Also note that
the name of this task was napalm_get, the default name, as our script did not specify one. Nornir can
consume netmiko and NAPALM connection handlers, which provide expansive support for many network
platforms, and this helps prove it.
(Pdb) facts['csr1000v'][0].changed
False
(Pdb) facts['csr1000v'][0].failed
False
(Pdb) facts['csr1000v'][0].name
'napalm_get'
After digging through all of the custom objects, we can test the result field for its type, which turns out to be
a basic dictionary with a top-level key of get_facts. The value is another dictionary with a handful of keys
containing device information. Simply printing out this field displays the dictionary that was pretty-printed by
the print_result() function shown earlier. The long get_facts dict output is broken up to fit the screen.
(Pdb) type(facts['csr1000v'][0].result)
<class 'dict'>
(Pdb) facts['csr1000v'][0].result
{'get_facts': {'uptime': 2340, 'vendor': 'Cisco',
'os_version': 'Virtual XE Software (X86_64_LINUX_IOSD-UNIVERSALK9-M),
Version 16.9.1, RELEASE SOFTWARE (fc2)', 'serial_number': '9RJTDVAF3DP',
'model': 'CSR1000V', 'hostname': 'CSR1000v', 'fqdn': 'CSR1000v.ec2.internal',
'interface_list': ['GigabitEthernet1', 'VirtualPortGroup0']}}
Using pdb to reference individual fields, we can add some custom code to test our understanding. For
example, suppose we want to build a single hyphenated string containing the hostname and serial number.
Using the f-string feature introduced in Python 3.6, this is simple and clean.
(Pdb) data = facts['csr1000v'][0].result['get_facts']
(Pdb) important_info = f"{data['hostname']}-{data['serial_number']}"
(Pdb) important_info
'CSR1000v-9RJTDVAF3DP'
Armed with this new understanding, we can add these exact lines to our existing runbook and continue
development using the data dictionary as a handy shortcut to access the IOS facts.
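Dropped into the runbook itself (outside of pdb), that addition might look like the short sketch below; the host
name and field names simply mirror the pdb session above.

# Shortcut into the parsed facts for the csr1000v host.
data = facts['csr1000v'][0].result['get_facts']
important_info = f"{data['hostname']}-{data['serial_number']}"
print(important_info)  # e.g. CSR1000v-9RJTDVAF3DP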
It is worthwhile to explain Nornir's run() function in greater depth. The run() function takes in a task
object, which is just another function. Because everything can be treated like an object in Python, passing
functions as parameters into other functions to be executed later is easy. This parameter function is a task
and contains the logic to perform some action, such as running a command, gathering facts, or making
configuration changes.
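The manage_router grouped task referenced below is not reproduced in full here. As a rough sketch only,
assuming the napalm_get and napalm_configure task plugins seen in the output that follows, such a function
could look like this:

from nornir.plugins.tasks.networking import napalm_get, napalm_configure

def manage_router(task, config_lines):
    # Grouped task: each inner task.run() call shows up as its own Result
    # for this host in the final output.
    task.run(task=napalm_get, getters=["get_facts"])

    # Merge the requested configuration lines; NAPALM computes the diff and
    # only reports changed=True when the device configuration actually moves.
    task.run(task=napalm_configure, configuration="\n".join(config_lines))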
def main():
# Run the grouped task function to get facts and apply config.
from_tasks = nr.run(task=manage_router, config_lines=services)
if __name__ == '__main__':
main()
Running this code yields the following output. Tasks are printed out in the sequence in which they were
invoked. This particular router required Nagle and sequence-number services to be enabled, and needed
the no service pad line removed, as the diff below shows.
manage_router***********************************************************
* csr1000v ** changed : True ***************************************************
vvvv manage_router ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
---- napalm_get ** changed : False ------------------------------------- INFO
{ 'get_facts': { 'fqdn': 'CSR1000v.ec2.internal',
'hostname': 'CSR1000v',
'interface_list': ['GigabitEthernet1', 'VirtualPortGroup0'],
'model': 'CSR1000V',
'os_version': 'Virtual XE Software '
'(X86_64_LINUX_IOSD-UNIVERSALK9-M), Version '
'16.9.1, RELEASE SOFTWARE (fc2)',
'serial_number': '9RJTDVAF3DP',
'uptime': 1560,
'vendor': 'Cisco'}}
---- napalm_configure ** changed : True -------------------------------- INFO
+service nagle
+service sequence-numbers
-no service pad
^^^^ END manage_router ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Because NAPALM is idempotent with respect to IOS configuration management, running the runbook again
should yield no changes when the napalm_configure task is executed. The changed return value changes
from True in the previous output to False below. No diff is supplied as a result.
[ec2-user@devbox nornir-test]# python3 manage_router_ios.py
manage_router***********************************************************
* csr1000v ** changed : False **************************************************
vvvv manage_router ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
---- napalm_get ** changed : False ------------------------------------- INFO
{ 'get_facts': { 'fqdn': 'CSR1000v.ec2.internal',
'hostname': 'CSR1000v',
'interface_list': ['GigabitEthernet1', 'VirtualPortGroup0'],
'model': 'CSR1000V',
'os_version': 'Virtual XE Software '
'(X86_64_LINUX_IOSD-UNIVERSALK9-M), Version '
'16.9.1, RELEASE SOFTWARE (fc2)',
'serial_number': '9RJTDVAF3DP',
'uptime': 2040,
'vendor': 'Cisco'}}
---- napalm_configure ** changed : False ------------------------------- INFO
^^^^ END manage_router ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Rerunning the code with a pdb trace applied at the end of the program allows Nornir users to explore the
from_tasks variable in more depth. For each host (in this case csr1000v), there is a MultiResult object, a
list-like collection of Result objects. This list includes the result from the wrapper function, not just the inner
tasks, so its length should be 3: the grouped function followed by the 2 tasks. For troubleshooting, they can
be indexed as shown below. Notice the empty-string diff returned by NAPALM from the napalm_configure task,
an indicator that our network hasn't experienced any changes since the last Nornir run.
[ec2-user@devbox nornir-test]# python3 manage_router_ios.py
> /home/ec2-user/nornir-test/manage_router_ios.py(31)main()
-> print_result(from_tasks)
(Pdb) from_tasks
AggregatedResult (manage_router): {'csr1000v': MultiResult:
(Pdb) from_tasks['csr1000v']
MultiResult: [Result: "manage_router", Result: "napalm_get",
Result: "napalm_configure"]
(Pdb) len(from_tasks['csr1000v'])
3
(Pdb) from_tasks['csr1000v'][0]
Result: "manage_router"
(Pdb) from_tasks['csr1000v'][1]
Result: "napalm_get"
(Pdb) from_tasks['csr1000v'][2]
Result: "napalm_configure"
(Pdb) from_tasks['csr1000v'][2].diff
''
return sp
def resolve_msp(self):
"""
Given two vectors of equal length, the minimum scalar product is
the smallest number that exists given all permutations of
multiplying numbers between the two vectors.
"""
This Github account is used to demonstrate a revision control example. Suppose that a change to the
Python script above is required; specifically, a trivial comment change. Checking git status first, the
repository is up to date, as no changes have been made. This example explores git at a very basic level and
does not include branches, forks, pull requests, etc.
Nicholass-MBP:min-scalar-prod nicholasrusso# git status
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working directory clean
### OPEN THE TEXT EDITOR AND MAKE CHANGES (NOT SHOWN) ###
git status now reports that VectorPair.py has been modified but not added to the set of files to be
committed to the repository. The "Changes not staged for commit" message indicates that the file is not
currently in the staging area.
Nicholass-MBP:min-scalar-prod nicholasrusso# git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: VectorPair.py
no changes added to commit (use "git add" and/or "git commit -a")
Next, the file is committed with a comment explaining the change. This command does not update the
Github repository, only the local one. Code contained in the local repository is, by definition, one program-
mer’s local work. Other programmers may be contributing to the remote repository while another works
locally for some time. This is why git is considered a “distributed” version control system.
Nicholass-MBP:min-scalar-prod nicholasrusso# git commit -m "evolving tech comment update"
[master 74ed39a] evolving tech comment update
1 file changed, 2 insertions(+), 2 deletions(-)
Looking into the min-scalar-prod directory and specifically the VectorPair.py file, git clearly displays the
Note that the permissions of the Development group should include AWSCodeCommitFullAccess.
Navigating to the CodeCommit service, create a new repository called awsgit without selecting any other
fancy options. This initializes an empty repository. This is the equivalent of creating a new repository in
Github without having pushed any files to it.
Next, perform a clone operation from the AWS CodeCommit repository using HTTPS. While the repository
is empty, this establishes successful connectivity with AWS CodeCommit.
Nicholass-MBP:projects nicholasrusso# git clone \
> https://git-codecommit.us-east-1.amazonaws.com/v1/repos/awsgit
Cloning into 'awsgit'...
Username for 'https://git-codecommit.us-east-1.amazonaws.com': nrusso-at-043535020805
Password for 'https://[email protected]':
warning: You appear to have cloned an empty repository.
Checking connectivity... done.
Following the basic git workflow, we add the file to the staging area, commit it to the local repository, then
push it to the AWS CodeCommit repository called awsgit.
Nicholass-MBP:awsgit nicholasrusso# git add .
Check the AWS console to see if the file was correctly received by the repository. It was, and even better,
CodeCommit supports Markdown rendering just like Github, Gitlab, and many other GUI-based systems.
To build on this basic repository, we can enable continuous integration (CI) using the AWS CodeBuild service.
It ties in seamlessly with CodeCommit, unlike other common integrations (such as Github + Jenkins) that
require many manual steps. The author creates a sample project below based on Fibonacci numbers, which are
numbers whereby the next number is the sum of the previous two. Some additional error-checking is added
to check for non-integer inputs, which makes the test cases more interesting. The Python file below is called
fibonacci.py.
#!/bin/python
def fibonacci(n):
if not isinstance(n, int):
raise ValueError('Please use an integer')
elif n < 2:
return n
else:
return fibonacci(n-1) + fibonacci(n-2)
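As a quick sanity check before writing formal tests, the function can be exercised from an interactive Python
session; the values below match the expectations coded into the unit tests that follow.

>>> from fibonacci import fibonacci
>>> fibonacci(10)
55
>>> fibonacci(20)
6765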
Any good piece of software should come with unit tests. Some software development methodologies, such
as Test Driven Development (TDD), even suggest writing the unit tests before the code itself! Below are the
enumerated test cases used to test the Fibonacci function defined above. The three test cases evaluate
zero/negative number inputs, bogus string inputs, and valid integer inputs. The test script below is called
fibotest.py.
import unittest
from fibonacci import fibonacci
class fibotest(unittest.TestCase):
def test_input_zero_neg(self):
self.assertEqual(fibonacci(0), 0)
self.assertEqual(fibonacci(-1), -1)
self.assertEqual(fibonacci(-42), -42)
def test_input_invalid(self):
try:
n = fibonacci('oops')
self.fail()
except ValueError:
pass
except:
self.fail()
def test_input_valid(self):
self.assertEqual(fibonacci(1), 1)
self.assertEqual(fibonacci(2), 1)
self.assertEqual(fibonacci(10), 55)
self.assertEqual(fibonacci(20), 6765)
self.assertEqual(fibonacci(30), 832040)
The test cases above are executed using the unittest toolset which loads in all the test functions and
executes them in a test environment. The file below is called runtest.py.
#!/bin/python
import unittest
import sys
from fibotest import fibotest
def runtest():
testRunner = unittest.TextTestRunner()
testSuite = unittest.TestLoader().loadTestsFromTestCase(fibotest)
testRunner.run(testSuite)
runtest()
To manually run the tests, simply execute the runtest.py code. There are, of course, many different ways
to test Python code. A simpler alternative could have been to use pytest but using the unittest strategy
is just as effective.
Nicholass-MBP:awsgit nicholasrusso# python runtest.py
...
----------------------------------------------------------------------
Ran 3 tests in 0.970s
OK
However, the goal of CodeBuild is to offload this testing to AWS based on triggers, which can be manual
scheduling, commit-based, time-based, and more. In order to provide the build specifications for AWS so it
knows what to test, the buildspec.yml file can be defined. Below is a simple, one-stage CI pipeline that just
runs the test code we developed.
# buildspec.yml
phases:
pre_build:
commands:
- python runtest.py
Add, commit, and push these new files to the repository (not shown). Note that the author also added a
.gitignore file so that the Python machine code (.pyc) files would be ignored by git. Verify that the source
code files appear in CodeCommit.
Click on the fibonacci.py file as a sanity check to ensure the text was transferred successfully. Notice that
CodeCommit does some syntax highlighting to improve readability.
At this point, you can schedule a build in CodeBuild to test out your code. The author does not walk through
setting up CodeBuild because there are many tutorials on it, and it is simple. A basic screenshot below
shows the process at a high level. CodeBuild will automatically spin up a test instance of sorts (in this case,
Ubuntu Linux with Python 3.5.2) to execute the buildspec.yml file.
After the manual build (in our case just a unit test; we didn't actually "build" anything), the detailed results are
displayed on the screen. The phases that were not defined in the buildspec.yml file, such as INSTALL,
BUILD, and POST_BUILD, instantly succeed as they do not exist. Actually testing the code in the PRE_BUILD
phase only took 1 second. To see this test take longer, define test cases that use larger numbers for the
Fibonacci function input, such as 50.
Below these results is the actual machine output, which matches the test output we generated when running
the tests manually. This indicates a successful CI pipeline integration between CodeCommit and CodeBuild.
Put another way, it is a fully integrated development environment without the manual setup of Github +
Jenkins, Bitbucket + Travis CI, or whatever other combination of SCM + CI you can think of.
Note that build history, as it is in every CI system, is also available. The author initially failed the first build
test due to a configuration problem within buildspec.yml, which illustrates the value of maintaining build
history.
The remainder of this section is focused on SVN client-side operations, where the author uses another
Amazon Linux EC2 instance to represent a developer’s workstation.
First, SVN must be installed using the command below. Like git, it is a relatively small program with a few
small dependencies. Afterwards, ensure the svn command is in your path, which should happen automatically.
[root@devbox ec2-user]# yum install subversion
Loaded plugins: amazon-id, rhui-lb, search-disabled-repos
[snip]
Installed:
subversion.x86_64 0:1.7.14-14.el7
Complete!
Use the command below to checkout (similar to git’s pull or clone) the empty repository built on the SVN
server. The author put little effort into securing this environment, as evidenced by using HTTP and without
any data protection on the server itself. Production repositories would likely not see the authentication
warning below.
[root@devbox ~]# svn co --username nrusso http://svn.njrusmc.net/svn/repo1 repo1
Authentication realm: <http://svn.njrusmc.net:80> SVN Repos
Password for 'nrusso':
-----------------------------------------------------------------------
ATTENTION! Your password for authentication realm:
[snip password warning]
Checked out revision 0.
The SVN system will automatically create a directory called "repo1" in the
working directory where the SVN checkout was performed. There are no
version-controlled files in it, since the repository has no code yet.
Next, change to this repository directory and look at the repository information. There is nothing particularly
interesting, but it is handy in case you forget the URL or current revision.
[root@devbox ~]# cd repo1/
Quickly test the code by executing it with the command below (not that the mathematical correctness matters
for this demonstration).
[root@devbox repo1]# python svn_test.py
2^4 is 16
3^5 is 243
4^6 is 4096
5^7 is 78125
Like git, SVN has a status option. The question mark next to the new Python file indicates that SVN does
not know what this file is. In git terms, it is an untracked file that needs to be added to the version control
system.
[root@devbox repo1]# svn status
? svn_test.py
The SVN add command is somewhat similar to git add with the exception that files are only added once. In
git, add moves files from the working directory to the staging area. In SVN, add moves untracked files into
a tracked status. The A at the beginning of the line indicates the file was added.
[root@devbox repo1]# svn add svn_test.py
A svn_test.py
In case you missed the output above, you can use the status command (st is a built-in alias) to verify that
the file was added.
[root@devbox repo1]# svn st
A svn_test.py
The last step involves the commit action to push changes to the SVN repository. The output indicates we
are now on version 1.
The SVN status shows no changes. This is similar to a git "clean working directory" but is implicit given the
lack of output.
[root@devbox repo1]# svn st
[root@devbox repo1]#
Below are screenshots of the repository as viewed from a web browser. Now, our new file is present.
As in most git-based repository systems with GUIs, such as Github or Gitlab, you can click on the file to
see its contents. While this version of SVN server is a simple Apache2-based, no-frills implementation, this
feature still works. Clicking on the hyperlink reveals the source code contained in the file.
Next, make some changes to the file. In this case, remove one test case and add a new one. Verify the
changes were saved.
SVN status now reports the file as modified, similar to git. Use the diff command to view the changes.
Plus signs (+) and minus signs (-) are used to indicate additions and deletions, respectively.
[root@devbox repo1]# svn status
M svn_test.py
Unlike git, there is no staging area, so running the add command a second time fails. The file is already
under version control and so can be committed directly to the repository.
[root@devbox repo1]# svn add svn_test.py
svn: warning: W150002: '/root/repo1/svn_test.py' is already under version control
svn: E200009: Could not add all targets because some targets are already versioned
svn: E200009: Illegal target for the requested operation
Using the built-in ci alias for commit, push the changes to the repository. The current code version is
incremented to 2.
[root@devbox repo1]# svn ci svn_test.py -m"different numbers"
Sending svn_test.py
Transmitting file data .
Committed revision 2.
To view log entries, use the update command first to bring changes from the remote repository into our
workspace. This ensures that the subsequent log command works correctly, similar to git’s log command.
Using the verbose option, one can see all of the relevant history for these code modifications.
[root@devbox repo1]# svn update
Updating '.':
At revision 2.
different numbers
------------------------------------------------------------------------
r1 | nrusso | 2018-05-05 10:47:03 -0400 (Sat, 05 May 2018) | 1 line
Changed paths:
A /svn_test.py
In a production environment, one might leverage Kubernetes to maintain several pods, each of which runs
one instance of Batfish, to provide increased scale and availability. Putting all of the Batfish pods behind
a common Kubernetes service (effectively a DNS hostname) is one approach to building an enterprise-
grade Batfish deployment. In the interest of simplicity, this demo will employ Batfish to analyze two large
OSPF networks. These come from Cisco Live presentations that the author has delivered in the past, and
each one has roughly 20 network devices. The BRKRST-3310 session focuses on troubleshooting and automation while
the DIGRST-2337 session focuses on design and deployment. The hyperlinks lead to the configuration
repositories for each session. Those repositories are cloned from GitHub below.
[ec2-user@devbox bf]# git clone https://github.com/nickrusso42518/ospf_brkrst3310.git
Cloning into 'ospf_brkrst3310'...
remote: Enumerating objects: 133, done.
remote: Total 133 (delta 0), reused 0 (delta 0), pack-reused 133
Receiving objects: 100% (133/133), 342.87 KiB | 0 bytes/s, done.
Resolving deltas: 100% (90/90), done.
Batfish consumes information by encapsulating the relevant data into “snapshots”. A snapshot is repre-
sented on the filesystem as a hierarchical directory structure with a variety of subdirectories. The only
relevant directory in this demo is configs/ which contains network device configurations (Batfish does not
care about file extensions). More generally, a snapshot is a collection of configurations for a given network
at a given point in time. Batfish can operate on multiple networks independently, each with many snapshots.
Within a given network, you can analyze the differences between any pair of snapshots.
[ec2-user@devbox bf]# mkdir -p snapshots/brkrst3310/configs
[ec2-user@devbox bf]# cp ospf_brkrst3310/final-configs/*.txt snapshots/brkrst3310/configs/
[ec2-user@devbox bf]# ls -1 snapshots/brkrst3310/configs/
R10.txt
R11.txt
R12.txt
(snip)
The output below reveals the full tree structure. The snapshot directory is named brkrst3310 and the
configs/ subdirectory contains all of the network device configurations. To add additional snapshots for
other networks, simply create a new directory under the snapshots/ parent directory. For now, ignore the
other (empty) directories.
[ec2-user@devbox bf]# tree snapshots/ --charset=ascii
snapshots/
`-- brkrst3310
|-- batfish
|-- configs
| |-- R10.txt
| |-- R11.txt
(snip)
| |-- R8.txt
| `-- R9.txt
"""
Author: Nick Russo
Purpose: Tests Batfish on sample Cisco Live sessions focused
on the OSPF routing protocol using archived configurations.
"""
import sys
import json
import pandas
from pybatfish.client.commands import *
from pybatfish.question import bfq, load_questions
def main(directory):
"""
Tests Batfish logic on a specific snapshot directory.
"""
# Generate CSV data using pipe separator (bf data has commas)
csv_data = pandas_frame.to_csv(sep="|")
with open(f"{file_name}.csv", "w") as handle:
handle.write(csv_data)
if __name__ == "__main__":
# Check for at least 2 CLI args; fail if absent
if len(sys.argv) < 2:
print("usage: python bf.py <snapshot_dir_name>")
sys.exit(1)
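The core Batfish calls are omitted from the listing above. As a hedged sketch only, the middle of main() is
conceptually similar to the following; the exact question names depend on the templates that load_questions()
pulls in, so treat them as assumptions.

from pybatfish.client.commands import bf_init_snapshot, bf_set_network
from pybatfish.question import bfq, load_questions

# Point Batfish at the snapshot directory and load its question templates.
bf_set_network("brkrst3310")
bf_init_snapshot("snapshots/brkrst3310", name="brkrst3310", overwrite=True)
load_questions()

# Ask one OSPF question and convert the "answer" into a pandas data frame;
# the pipe separator avoids clashing with commas embedded in the data.
pandas_frame = bfq.ospfProcessConfiguration().answer().frame()
pandas_frame.to_csv("outputs/proc_brkrst3310.csv", sep="|")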
After a few seconds, the script completes, and the outputs/ directory contains 16 new files (4 questions
asked * 4 output formats). The script asked Batfish for OSPF area, interface, process, and neighbor infor-
mation specifically. The full list of supported Batfish questions is listed in the documentation.
[ec2-user@devbox bf]# ls -1 outputs/
area_brkrst3310.csv
area_brkrst3310.html
area_brkrst3310.json
area_brkrst3310.pandas.txt
intf_brkrst3310.csv
intf_brkrst3310.html
intf_brkrst3310.json
intf_brkrst3310.pandas.txt
nbrs_brkrst3310.csv
nbrs_brkrst3310.html
nbrs_brkrst3310.json
nbrs_brkrst3310.pandas.txt
proc_brkrst3310.csv
proc_brkrst3310.html
proc_brkrst3310.json
proc_brkrst3310.pandas.txt
We’ll examine one of each file corresponding to one of each feature. Starting with the OSPF area JSON
file, we see a list of dictionaries. Each dictionary describes a different OSPF area from the perspective of
a network device. In this case, Batfish says R6 has area 4 configured as an NSSA. R6 also has three
interfaces in that area, one of which is passive. Regarding R2, it has area 1 configured as a standard area
with only one active interface participating in that area. All of these statements are true; you can check the
GitHub configurations or topology diagram yourself if you like.
[ec2-user@devbox bf]# head -n 26 outputs/area_brkrst3310.json
[
{
"Node": "r6",
"VRF": "default",
"Process_ID": "1",
"Area": "4",
"Area_Type": "NSSA",
"Active_Interfaces": [
"Ethernet0/0",
"Serial1/1"
],
"Passive_Interfaces": [
"Loopback0"
]
},
Rather than scrub the file, it makes more sense to examine a web browser screenshot as shown below.
Some rows have been deleted for brevity. Because the table is very wide and will be hard to read in this
book, the author has manually shortened some column names. At a glance, the data looks correct, as all
Ethernet interfaces in the topology typically have a cost of 10, use standard OSPF hello/dead timers, are
not passive (i.e., links between devices), and use the P2P network type.
Next, let’s examine the OSPF process CSV file. Using the column command, an engineer can view a
tabular file without needing a spreadsheet application. Note that this particular “answer” embeds commas
in the data, so the Python script used the pipe (|) character instead. Again, the author has shortened some
column names to keep the table clean. Like the JSON and HTML files, this data is correct per the network
topology.
[ec2-user@devbox bf]# column -s'|' -t outputs/proc_brkrst3310.csv | less -S
Last, we can view a string representation of the raw pandas data frame, which is presented in a table-like
format. It’s a long file (38 lines) so we’ll examine the first several lines for brevity.
[ec2-user@devbox bf]# wc outputs/nbrs_brkrst3310.pandas.txt
38 116 1520 outputs/nbrs_brkrst3310.pandas.txt
Then, run the bf.py script and pass in “digrst2337”, the directory name, as a command-line argument.
Some output has been omitted for brevity.
[ec2-user@devbox bf]# python bf.py digrst2337
status: TRYINGTOASSIGN
.... no task information
status: ASSIGNED
.... 2020-12-20 14:56:39.149000+00:00 Begin job.
status: ASSIGNED
.... 2020-12-20 14:56:39.149000+00:00 Parse network configs 1 / 20.
status: ASSIGNED
.... 2020-12-20 14:56:39.149000+00:00 Parse network configs 2 / 20.
(snip)
status: TERMINATEDNORMALLY
.... 2020-12-20 14:56:43.849000+00:00 Begin job.
Last, review the output files generated by the script as it relates to the specified snapshot. For those
interested in scrubbing the data in greater depth, all of these files have been uploaded to their respective
Cisco Live GitHub repositories in the batfish_answers/ directory.
[ec2-user@devbox bf]# ls -1 outputs/*2337*
area_digrst2337.csv
area_digrst2337.html
area_digrst2337.json
area_digrst2337.pandas.txt
intf_digrst2337.csv
intf_digrst2337.html
intf_digrst2337.json
intf_digrst2337.pandas.txt
nbrs_digrst2337.csv
nbrs_digrst2337.html
nbrs_digrst2337.json
nbrs_digrst2337.pandas.txt
proc_digrst2337.csv
proc_digrst2337.html
proc_digrst2337.json
proc_digrst2337.pandas.txt
As a final note, Batfish has uses beyond just network configuration analysis. As evidenced by the empty
directories above, it can trace traffic flows between hosts, even with complex iptables rulesets. More
recently, it can analyze Amazon Web Services (AWS) architectures within a Virtual Private Cloud (VPC)
instance. From a business perspective, integrating Batfish into CI/CD pipelines in a pre-check or post-check
role can help catch configuration errors before and after changes are deployed.
As a general comment, one IoT strategy is to “mesh under” and “route over”. This loosely follows the 7-
layer OSI model by attempting to constrain layers 1 and 2 to the IoT network, to include RF networking and
link-layer communications, then using some kind of IP overlay of sorts for network reachability. Additional
details about routing protocols for IoT are discussed later in this document.
The mobility of an IoT device is going to be largely determined by its access method. Devices that are on
802.11 Wi-Fi within a factory will likely have mobility through the entire factory, or possibly the entire com-
plex, but will not be able to travel large geographic distances. For some specific manufacturing work carts
(containing tools, diagnostic measurement machines, etc), this might be an appropriate method. Devices
connected via 4G LTE will have greater mobility but will likely represent something that isn’t supposed to be
constrained to the factory, such as a service truck or van. Heavy machinery bolted to the factory floor might
be wired since it is immobile.
Migrating to IoT need not be swift. For example, consider an organization which is currently running a virtual
private cloud infrastructure with some critical in-house applications in their private cloud. All remaining
commercial applications are in the public cloud. Assume this public cloud is hosted locally by an ISP
and is connected via an MPLS L3VPN extranet into the corporate VPN. If this corporation owns a large
manufacturing company and wants to begin deploying various IoT components, it can begin with the large
and immobile pieces.
Because IoT is so different from traditional networking, it is worth examining some of the layer-1 and layer-2
protocols relevant to IoT. One common set of physical layer enhancements that found traction in the IoT
space are power line communication (PLC) technologies. PLCs enable data communications transfer over
power lines and other electrical infrastructure devices. Two examples of PLC standards are discussed
below:
1. IEEE 1901.2–2013: This specification allows for up to 500 kbps of data transfer across alternating
current, direct current, and
non-energized electric power lines. Smart grid applications used to operate and maintain municipal
electrical delivery systems can rely on the existing power line infrastructure for limited data communi-
cations.
2. HomePlug GreenPHY: This technology is designed for home appliances such as refrigerators, stoves
(aka ranges), microwaves, and even plug-in electric vehicles (PEV). The technology allows devices to
be integrated with existing smart grid applications, similar to IEEE 1901.2–2013 discussed above. The
creator of this technology says that GreenPHY is a “manufacturer’s profile” of the IEEE specification,
and suggests that interworking is seamless.
Ethernet has become ubiquitous in most networks. Originally designed for LAN communications, it is
spreading into the WAN via “Carrier Ethernet” and into data center storage networks via “Fibre Channel over
Ethernet”, to name a few examples. In IoT, new “Industrial Ethernet” standards are challenging older “field
bus” standards. The author describes some of the trade-offs between these two technology sets below.
1. Field bus: Seen as a legacy family of technologies by some, field bus networks are still widely de-
ployed in industrial environments. This is partially due to its incumbency and the fact that many end-
points on the network have interfaces that support various field bus protocols (MODBUS, CANBUS,
etc). Field bus networks are economical as transmitting power over them is easier than power over
Ethernet. Field bus technologies are less sensitive to electrical noise, have greater physical range
without repeaters (copper Ethernet is limited to about 100 meters), and provide determinism to help
keep machine communications synchronized.
2. Industrial Ethernet: To overcome the lack of deterministic and reliable transport of traditional Ether-
net within the industrial sector, a variety of Ethernet-like protocols have been created. Two examples
include EtherCAT and PROFINET. While speeds of industrial Ethernet are much slower than modern
Ethernet (10 Mbps to 1 Gbps), these technologies introduce deterministic data transfer. In summary,
There is more than meets the eye with respect to standards and compliance for street lights. Most municipal-
ities (such as counties or townships within the United States) have ordinances that dictate how street lighting
works. The light must be a certain color, must not “trespass” into adjacent streets, must not negatively affect
homeowners on that street, etc. This complicates the question above because the lines become blurred
between organizations rather quickly. In cases like this, the discussions must occur between all stakehold-
ers, generally chaired by a Government/company representative (depending on the consumer/customer),
to draw clear boundaries between responsibilities.
Radio frequency (RF) spectrum is a critical point as well. While Wi-Fi can operate in the 2.4 GHz and 5.0
GHz bands without a license, there are no unlicensed 4G LTE bands at the time of this writing. Deploying
4G LTE capable devices on an existing carrier’s network within a developed country may not be a problem.
Deploying 4G LTE in developing or undeveloped countries, especially if 4G LTE spectrum is tightly regulated
but poorly accessible, can be a challenge.
1. Identity Services Engine (ISE): Profiles devices and creates IoT group policies
Because IoT devices are often energy constrained, much of the data aggregation research has been placed
in the physical layer protocols and designs around them. The remainder of this section discusses many of
these physical layer protocols/methods and compares them. Many of these protocols seek to maximize the
network lifetime, which is the elapsed time until the first node fails due to power loss.
1. Direct transmission: In this design, there is no aggregation or clustering of nodes. All nodes send
their traffic directly back to the base station. This simple solution is appropriate if the coverage area is
small or it is electrically expensive to receive traffic, implying that a minimal hop count is beneficial.
2. Low-Energy Adaptive Clustering Hierarchy (LEACH): LEACH introduces the concept of a “cluster”,
which is a collection of nodes in close proximity for the purpose of aggregation. A cluster head (CH) is
selected probabilistically within each cluster and serves as an aggregation node. All other nodes in the
cluster send traffic to the CH, which communicates upstream to the base station. This relatively long-
haul transmission consumes more energy, so rotating the CH role is advantageous to the network as
a whole. Last, it supports some local processing/aggregation of data to reduce traffic sent upstream
(which consumes energy). Compared to direct transmission, LEACH prolongs network lifetime and
reduces energy consumption. A brief sketch of the commonly cited CH election threshold appears after this list.
3. LEACH-Centralized (LEACH-C): This protocol modifies LEACH by centralizing the CH selection pro-
cess. All nodes report their location and energy level to the base station, which computes the average energy
level. Those with above-average remaining energy in each cluster are selected as CH. The base sta-
tion also notifies all other nodes of this decision. The CH may not change at regular intervals (rounds)
since the CH selection is more deliberate than with LEACH. LEACH distributes the CH role between
nodes in a probabilistic (randomized) fashion, whereas LEACH-C relies on the base station to make
this determination. The centralization comes at an energy cost since all nodes are transmitting their
current energy status back to the base station between rounds. The logic of the base station itself
also becomes more complex with LEACH-C compared to LEACH.
4. Threshold-sensitive Energy Efficiency Network (TEEN): This protocol differs from LEACH in that
it is reactive, not proactive. The radio stays off unless there is a significant change worth reporting.
This implies there are no periodic transmissions, which saves energy. Similar to the LEACH family,
each node becomes the CH for some time (cluster period) to distribute the energy burden of long-haul
communication to the base station. If the trigger thresholds are not crossed, nodes have no reason to
communicate. TEEN is excellent for intermittent, alert-based communications as opposed to routine
communications. This is well suited for event-driven, time sensitive applications.
5. Power Efficient Gathering in Sensor Info Systems (PEGASIS): Unlike the LEACH and TEEN fam-
ilies, PEGASIS is a chain based protocol. Nodes are connected in round-robin fashion (a ring); data
is sent only to a node’s neighbors, not to a CH within a cluster. These short transmission distances
further minimize energy consumption. Rather than rotate the burdensome CH role, all nodes do a
little more work at all times. Only one node communicates with the base station. Each node determines which
other nodes are closest to it by measuring the receive signal strength indicator (RSSI) of incoming radio
signals. PEGASIS is optimized for dense networks.
6. Minimum Transmission of Energy (MTE): MTE is conceptually similar to PEGASIS in that it is a
chain based protocol designed to minimize the energy required to communicate between nodes. In
contrast with direct transmission, MTE assumes that the cost to receive traffic is low, and works well
over long distances with sparse networks. MTE is more computationally complex than PEGASIS,
again assuming that the energy cost of radio transmission is greater than the energy cost of local
processing. This may be true in some environments, such as long-haul power line systems and
interstate highway emergency phone booths.
7. Static clustering: Like the LEACH and TEEN families, static clustering requires creating geograph-
ically advantageous clusters for the purpose of transmission aggregation. This technique is static in
that the cluster assignments do not change over time.
Note that many of these protocols are still under extensive development, research, and testing.
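To make the probabilistic CH rotation described in the LEACH item above more concrete, the short Python
sketch below implements the commonly cited LEACH election threshold, T(n) = P / (1 - P * (r mod 1/P)). This
is an illustrative simulation only and is not tied to any specific product or sensor platform.

import random

def leach_threshold(p, current_round):
    """Commonly cited LEACH threshold for nodes that have not yet served as
    CH in the current epoch; p is the desired fraction of cluster heads."""
    return p / (1 - p * (current_round % int(1 / p)))

def elect_cluster_heads(nodes, p, current_round):
    # Each eligible node draws a random number; values below the threshold
    # promote that node to cluster head (CH) for this round.
    threshold = leach_threshold(p, current_round)
    return [node for node in nodes if random.random() < threshold]

# Example: 100 sensor nodes, 10 percent desired CHs, round 3.
nodes = [f"node{i}" for i in range(100)]
print(elect_cluster_heads(nodes, p=0.1, current_round=3))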
4.1 Cloud
4.1.1 Troubleshooting and Management
One of the fundamental tenets of managing a cloud network is automation. Common scripting languages,
such as Python, can be used to automate a specific management task or set of tasks. Other network device
management tools, such as Ansible, allow an administrator to create a custom script and execute it on many
devices concurrently. This is one example of the method by which administrators can directly apply task
automation in the workplace.
Troubleshooting a cloud network is often reliant on real-time network analytics. Collecting network perfor-
mance statistics is not a new concept, but designing software to intelligently parse, correlate, and present
the information in a human-readable format is becoming increasingly important to many businesses. With
a good analytics engine, the NMS can move/provision flows around the network (assuming the network is
both disaggregated and programmable) to resolve any problems. For problems that cannot be resolved
automatically, the issues are brought to the administrator’s attention using these engines. The administrator
can use other troubleshooting tools or NMS features to isolate and repair the fault. Sometimes these ana-
lytics tools will export reports in YAML, JSON, or XML, which can be archived for reference. They can also
be fed into in-house scripts for additional, business-specific analysis. Put simply, analytics reduces “data”
into “actionable information”.
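As a trivial example of feeding such an export into an in-house script, the Python sketch below loads a
hypothetical JSON report and flags links whose utilization exceeds a threshold; the file name and field names
are invented purely for illustration and will differ between products.

import json

# Hypothetical analytics export; the structure is assumed for illustration.
with open("analytics_report.json") as handle:
    report = json.load(handle)

# Reduce raw "data" into "actionable information": list the busy links.
THRESHOLD = 80  # percent utilization
busy_links = [
    link["name"]
    for link in report.get("links", [])
    if link.get("utilization_pct", 0) > THRESHOLD
]
print(f"Links above {THRESHOLD}% utilization: {busy_links}")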
This section briefly explores installing OpenStack on Amazon AWS as an EC2 instance. This is effec-
tively “cloud inside of cloud” and while easy and inexpensive to install, is difficult to operate. As such, this
demonstration details basic GUI navigation, CLI troubleshooting, and Nova instance creation using Cinder
for storage.
For simplicity, packstack running on CentOS7 is used. The packstack package is a pre-made collection
of OpenStack’s core services, including but not limited to Nova, Cinder, Neutron (limited), Horizon, and
Keystone. Only these five services are explored in this demonstration.
The installation process for packstack on CentOS7 and RHEL7 can be found on the RDO Project website. The author
recommends using a t2.large or t2.xlarge AWS EC2 instance for the CentOS7/RHEL7 base operating
system. At the time of this writing, these instances cost less than 0.11 USD/hour and are affordable,
assuming the instance is terminated after the demonstration is complete. The author used AWS Route53
(DNS) service to map the packstack public IP to http://packstack.njrusmc.net (link is dead at the time
of this writing) to simplify access (this process is not shown). This is not required, but may simplify the
packstack HTTP server configuration later. Be sure to record the DNS name of the EC2 instance if you are
not using Route53 explicitly for hostname resolution; this name is auto-generated by AWS assuming the
EC2 instance is placed in the default VPC. Also, after installation completes, take note of the instructions
from the installer which provide access to the administrator password for initial login.
export OS_PROJECT_NAME=admin
export OS_USER_DOMAIN_NAME=Default
export OS_PROJECT_DOMAIN_NAME=Default
export OS_IDENTITY_API_VERSION=3
Next, if Horizon is behind a NAT device (which is generally true for AWS deployments), be sure to add a
ServerAlias in /etc/httpd/conf.d/15-horizon_vhost.conf, as shown below. This will allow HTTP GET
requests to the specific URL to be properly handled by the HTTP server running on the packstack instance.
Note that the VirtualHost tags already exist and the ServerAlias must be added within those bounds.
<VirtualHost *:80>
[snip]
ServerAlias packstack.njrusmc.net
[snip]
</VirtualHost>
The final pre-operations step recommended by the author is to reboot the system. The packstack installer
may also suggest this is required. After reboot, log back into packstack via SSH, switch to root with a full
login, and verify the hostname has been retained. Additionally, all packstack related environmental variables
should be automatically loaded, simplifying CLI operations.
Navigate to the packstack instance’s DNS hostname in a web browser. The OpenStack GUI somewhat
resembles that of AWS, which makes sense since both are meant to be cloud dashboards. The screenshot
that follows shows the main GUI page after login, which is the Identity -> Projects page. Note that
a “demo” project already exists, and fortunately for those new to OpenStack, there is an entire sample
structure built around this. This document will explore the demo project specifically.
This demonstration begins by exploring Keystone. Click on the Identity -> Users page, find the demo
user, and select “Edit”. The screen that follows shows some of the fields required, and most are self-evident
(name, email, password, etc). Update a few of the fields to add an email address and select a primary
project, though neither is required. For brevity, this demonstration does not explore groups and roles, but
these options are useful for management of larger OpenStack deployments.
Next, click on Identity -> Project and edit the demo project. The screenshots that follow depict the
demonstration configuration for the basic project information and team members. Only the demo user
is part of this project. Note that the Quota tab can be used in a multi-tenant environment, such as a
hosted OpenStack-as-a-service solution, to ensure that an individual project does not consume too many
resources.
The project members tab is shown below. Only the demo user is a member of the demo project by default,
and this demonstration does not make any modifications to this.
Under the Source tab, select Yes for Delete Volume on Instance Delete. This ensures that when the
instance is terminated, the storage along with it is deleted also. This is nice for testing when instances are
terminated regularly and their volumes are no longer needed. It’s also good for public cloud environments
where residual, unused volumes cost money (lesson learned the hard way). Click on the up arrow next to
the CirrOS instance to move it from the Available menu to the Allocated menu.
Under Flavor, select m1.tiny which is appropriate for CirrOS. These flavors are general sizing models for
the instance as it relates to compute, memory, and storage.
At this point, it is technically possible to launch the instance, but there are other important fields to consider.
It would be exhaustive to document every single instance option, so only the most highly relevant ones are
explored next.
Under Security Groups, note that the instance is part of the default security group since no explicit ones
were created. This group allows egress communication to any IPv4 or IPv6 address, but no unsolicited
ingress communication. Security Groups are stateful, so that returning traffic is allowed to reach the in-
stance on ingress. This is true in the opposite direction as well; if ingress rules were defined, returning
traffic would be allowed outbound even if the egress rules were not explicitly matched. AWS EC2 instance
security groups work similarly, except only in the ingress direction. No changes are needed on this page for
this demonstration.
Packstack does not come with any key pairs by default, which makes sense since the private key is only
available once, at key pair creation time. Under Key Pair, click on Create Key Pair. Be sure to store
the private key somewhere secure that provides confidentiality, integrity, and availability. Any Nova instances
deployed using this key pair can only be accessed using the private key file, much like a specific key opens
a specific lock.
After the key pair is generated, it can be viewed below and downloaded.
The OpenSSH client (standard on Linux and Mac OS) will refuse to use a private key unless the file's
permissions are sufficiently restrictive. In this case, the chmod 0400 command shown below sets the "read"
permission for the file's owner (nicholasrusso) and removes all other permissions.
Nicholass-MBP:ssh nicholasrusso$ ls -l CirrOS1-kp.pem
-rw-r--r--@ 1 nicholasrusso staff 1679 Aug 13 12:45 CirrOS1-kp.pem
Nicholass-MBP:ssh nicholasrusso$ chmod 0400 CirrOS1-kp.pem
Nicholass-MBP:ssh nicholasrusso$ ls -l CirrOS1-kp.pem
-r--------@ 1 nicholasrusso staff 1679 Aug 13 12:45 CirrOS1-kp.pem
Click on Launch Instance and navigate back to the main Instances menu. The screenshot that follows
shows two separate CirrOS instances as the author repeated the procedure twice.
Exploring the volumes for these instances shows the iSCSI disks (block storage on Cinder) mapped to each
CirrOS compute instance.
Accessing these instances, given the "cloud inside of a cloud" architecture, is non-trivial; it requires advanced
Neutron configuration that is not covered in this demonstration. Future releases of this document may
detail this.
Moving back to the CLI, there are literally hundreds of OpenStack commands used for configuration and
troubleshooting of the cloud system. The author's three favorite Nova commands are shown next. Note
that some columns were omitted so the output fits nicely on the page, but the information removed was
not terribly relevant for this demonstration. The host-list shows the host names and their services. The
service-list is useful for checking whether any hosts or services are down or disabled. The general list
enumerates the configured instances, including the two instances created above.
[root@ip-172-31-9-84 ~(keystone_admin)]# nova host-list
+----------------+-------------+----------+
| host_name | service | zone |
+----------------+-------------+----------+
| ip-172-31-9-84 | cert | internal |
| ip-172-31-9-84 | conductor | internal |
| ip-172-31-9-84 | scheduler | internal |
| ip-172-31-9-84 | consoleauth | internal |
| ip-172-31-9-84 | compute | nova |
+----------------+-------------+----------+
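For completeness, the other two commands referenced above are invoked as follows; their tabular outputs are not reproduced here.
nova service-list
nova list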
When the CirrOS instances were created, each was given a 1 GiB disk via iSCSI, which is block storage.
This is the Cinder service in action. The output below shows each volume mapped to a given instance; note
that a single instance could have multiple disks, just like any other machine.
[root@ip-172-31-9-84 ~(keystone_admin)]# cinder list
+-----------------+--------+------+--------+----------+-----------------+
| ID | Status | Size | Volume | Bootable | Attached |
+-----------------+--------+------+--------+----------+-----------------+
| 0681343e-(snip) | in-use | 1 | iscsi | true | 9dca3460-(snip) |
| 08554c2f-(snip) | in-use | 1 | iscsi | true | 2e4607d0-(snip) |
+-----------------+--------+------+--------+----------+-----------------+
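To add a second disk to an instance manually, a volume could be created and attached with commands roughly like the following; the volume name is arbitrary, the IDs are placeholders, and flag names differ slightly across client releases.
cinder create --display-name extra-disk 1
nova volume-attach <instance-id> <volume-id>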
The command that follows shows basic information about the public subnet that comes with the packstack
installer by default. Neutron was not explored in depth in this demonstration.
[root@ip-172-31-9-84 ~(keystone_admin)]# neutron net-show public
+---------------------------+--------------------------------------+
| Field | Value |
+---------------------------+--------------------------------------+
| admin_state_up | True |
| availability_zone_hints | |
| availability_zones | nova |
| created_at | 2017-08-14T01:53:35Z |
| description | |
| id | f627209e-a468-4924-9ee8-2905a8cf69cf |
| ipv4_address_scope | |
| ipv6_address_scope | |
| is_default | False |
| mtu | 1500 |
| name | public |
| project_id | d20aa04a11f94dc182b07852bb189252 |
| provider:network_type | flat |
| provider:physical_network | extnet |
| provider:segmentation_id | |
| revision_number | 5 |
| router:external | True |
| shared | False |
| status | ACTIVE |
| subnets | cbb8bad6-8508-45a0-bbb9-86546f853ae8 |
| tags | |
| tenant_id | d20aa04a11f94dc182b07852bb189252 |
| updated_at | 2017-08-14T01:53:39Z |
+---------------------------+--------------------------------------+
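As a sketch of how a new tenant network might be created from the CLI, the classic neutron client commands would look roughly like this; the network and subnet names and the CIDR are arbitrary, and newer releases use the unified openstack client instead.
neutron net-create demo-net
neutron subnet-create demo-net 192.168.50.0/24 --name demo-subnet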
There are a number of popular Agile methodologies. Two of them are discussed below.
1. Scrum is considered lightweight, as the intent of most Agile methodologies is to maximize the amount
of productive work accomplished during a given time period. In Scrum, a "sprint" is a period of time
during which certain tasks are expected to be accomplished. At the beginning of the sprint, the Scrum
Master (effectively a task manager) holds a planning meeting of roughly 4 hours in which work is prioritized
and assigned to individuals. Tasks are pulled from the sprint backlog into a given sprint. The only meetings
thereafter (within a sprint) are typically 15-minute daily stand-ups to report progress or problems (and
advance items across the Scrum board). If a sprint is 2 weeks (~80 working hours), then only about 6 hours
of it are spent in meetings. This may or may not include a retrospective discussion at the end of a sprint to
discuss what went well or poorly. Items such as bugs, features, and change requests are
tracked on a "Scrum board", which drives the work for the entire sprint.
2. Kanban is another Agile methodology that seeks to further optimize useful work done. Unlike
Scrum, it is less structured in terms of time and lacks the concept of a sprint. As such, there is
neither a sprint planning session nor a sprint retrospective. Rather than limiting work by units of time,
Kanban limits the number of tasks in progress at any one time. The Kanban board, therefore, is
more focused on tracking the number of tasks (sometimes called stories) in each stage of the development
cycle; work that has started but not finished is called Work In Progress (WIP). The most basic Kanban board
might have three columns: To Do, Doing, and Done. Ensuring that there is not too much work in any one
column keeps productivity high. Additionally, there is no official task manager in Kanban, though an
individual may assume a similar role depending on the size and scope of the project. Finally, release cycles
are not predetermined, which means releases can be more frequent.
Although these Agile methodologies were initially intended for software development, they can be adapted
for work in any industry. The author has personally seen Scrum used within a network security engineering
team to organize tasks, limit the scope of work over a period of time, and regularly deliver production-ready
designs, solutions, and consultancy to a large customer. The author also uses Kanban for personal
task management, as well as for network operations and even home construction projects. Both strategies
have universal applicability.
To install Jenkins, issue these commands as root (the indented line continuation is included for readability).
Technically, some of these commands can be issued by a non-root user, but the AWS installation guide for
Jenkins, included in the references, suggests doing so as root.
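# Note: Jenkins requires a Java runtime (OpenJDK 8 in this era); install one first if the base image lacks it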
wget -O /etc/yum.repos.d/jenkins.repo \
http://pkg.jenkins-ci.org/redhat/jenkins.repo
rpm --import https://pkg.jenkins.io/redhat/jenkins.io.key
yum install jenkins
service jenkins start
Verify Jenkins is working after completing the installation. Also, download the jenkins.war file (~64 MB)
to get Jenkins CLI access, which is useful for bypassing the GUI for some tasks. Because the file is large,
users may want to run the download as a background task by appending & to the command (not shown). The
WAR file is used below to check the Jenkins version.
[root@ip-10-125-0-85 .ssh]# service jenkins status
jenkins (pid 2666) is running...
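The version check with the downloaded WAR file looks roughly like the following; the download URL is an assumption and the reported version output (not shown) will vary by release.
wget http://mirrors.jenkins.io/war-stable/latest/jenkins.war
java -jar jenkins.war --version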
Once installed, log into Jenkins at http://jenkins.njrusmc.net:8080/, substituting the correct host ad-
dress. Enable the Git plugins via Manage Jenkins > Manage Plugins > Available tab. Enter git in the
filter field to locate the Git and Github plugins, then install them.
Log into Github and navigate to Settings > Developer settings > Personal access tokens. These
tokens can be used as an easy authentication method via shared-secret to access Github’s API. When
generating a new token, admin:org_hook must be granted at a minimum, but in the interest of experimen-
tation, the author selected a few other options as depicted in the image that follows.
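Once generated, the token can be exercised directly against the Github API as a quick sanity check from any shell; the token string below is a placeholder.
curl -H "Authorization: token <personal-access-token>" https://api.github.com/user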
After the token has been generated and the secret copied, go to Credentials > Global Credentials and
create a new credential. The graphic below depicts all parameters. This credential will be used for the
Github API integration.
Next, navigate to Manage Jenkins > Configure System, then scroll down to the Git and Github configura-
tions. Configure the Git username and email under the Git section. For the Github section, the secret text
authentication method should be used to allow Github API access.
The global Jenkins configuration for Git/Github integration is complete. Next, create a new repository (or
use an existing one) within Github. This process is not described as Github basics are covered elsewhere
in this book. The author created a new repository called jenkins-demo.
After creating the Github repository, the following commands are issued on the user’s machine to make a
first commit. Github provides these commands in an easy copy/paste format to get started quickly. The
assumption is that the user’s laptop already has the correct SSH integration with Github.
MacBook-Pro:jenkins-demo nicholasrusso$ echo "# jenkins-demo" >> README.md
MacBook-Pro:jenkins-demo nicholasrusso$ git init
Initialized empty Git repository in /Users/nicholasrusso/projects/jenkins-demo/.git/
MacBook-Pro:jenkins-demo nicholasrusso$ git add README.md
MacBook-Pro:jenkins-demo nicholasrusso$ git commit -m "first commit"
[master (root-commit) ac98dd9] first commit
1 file changed, 1 insertion(+)
create mode 100644 README.md
MacBook-Pro:jenkins-demo nicholasrusso$ git remote add origin \
> [email protected]:nickrusso42518/jenkins-demo.git
MacBook-Pro:jenkins-demo nicholasrusso$ git push -u origin master
After this initial commit, a simple Ansible playbook was added as the source code. Intermediate
file creation and Git source code management (SCM) steps are omitted for brevity, but there are now two
commits in the Git log. As it relates to Cisco Evolving Technologies, one would probably commit customized
code for Cisco Network Services Orchestrator (NSO) or perhaps Cisco-specific Ansible playbooks for
testing. Jenkins would be able to access these files, test them (locally or on a slave processing node within
the Jenkins system), and provide feedback about the build's quality. Jenkins can be configured to initiate
software builds (including compilation) using a variety of tools, and these builds can be triggered by a variety
of events. These features are not explored in detail in this demonstration.
---
# sample-pb.yml
- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - file:
        path: "/etc/ansible/ansible.cfg"
        # 'file' asserts the path already exists; 'present' is not a valid
        # state for the file module and would cause the task to fail
        state: file
...
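Though not required, the playbook can be sanity-checked locally before committing it, which is essentially the same check a Jenkins job would later automate (this assumes Ansible is installed on the local machine).
ansible-playbook --syntax-check sample-pb.yml
ansible-playbook sample-pb.yml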
Next, log into the Jenkins box, wherever it exists (the author is using an AWS EC2 m3.medium instance to
host Jenkins for this demo). SSH keys must be generated in the jenkins user's home directory since this is
the user running the software. In the current release of Jenkins, the home directory is /var/lib/jenkins/.
[root@ip-10-125-0-85 ~]# grep jenkins /etc/passwd
jenkins:x:498:497:Jenkins Automation Server:/var/lib/jenkins:/bin/false
The intermediate Linux file system steps to create the ~/.ssh/ directory and ~/.ssh/known_hosts file are
not shown for brevity, nor is the generation of the RSA 2048-bit key pair. Navigating to the .ssh directory
is recommended since additional commands below use these files.
[root@ip-10-125-0-85 .ssh]# pwd
/var/lib/jenkins/.ssh
[root@ip-10-125-0-85 .ssh]# ls
id_rsa id_rsa.pub known_hosts
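For reference, one way the key pair above could have been generated is shown below; the empty passphrase and the use of sudo to act as the jenkins user are assumptions made for unattended operation.
sudo -u jenkins ssh-keygen -t rsa -b 2048 -f /var/lib/jenkins/.ssh/id_rsa -N ""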
Next, add the Jenkins user's public key to Github under either your personal username or a dedicated
Jenkins utility user (preferred). For brevity, the author uses his personal username in this example, as shown
in the diagram that follows.
[root@ip-10-125-0-85 .ssh]# cat id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDd6qISM3f/mhmSeauR6DSFMhvlT8QkXyyY73Tk8Nu
f+SytelhP15gqTao3iA08LlpOBOnvtGXVwHEyQhMu0JTfFwRsTOGRRl3Yp9n6Y2/8AGGNTp+Q4tGpcz
Zkh/Xs7LFyQAK3DIVBBnfF0eOiX20/dC5W72aF3IzZBIsNyc9Bcka8wmVb2gdYkj1nQg6VQI1C6yayL
wyjFxEDgArGbWk0Z4GbWqgfJno5gLT844SvWmOWEJ1jNIw1ipoxSioVSSc/rsA0A3e9nWZ/HQGUbbhI
OGx7k4ruQLTCPeduU+VgIIj3Iws1tFRwc+lXEn58qicJ6nFlIbAW1kJj8I/+1fEj jenkins-ssh-key
The commands below verify that the keys are functional. Note that the -i flag must be used because the
command is run as root, and a different identity file (the Jenkins user's private key) should be used for this test.
[root@ip-10-125-0-85 .ssh]# ssh -T [email protected] -i id_rsa
Hi nickrusso42518! You've successfully authenticated, but GitHub does not provide shell access.
Before continuing, edit the /etc/passwd file as root to give the Jenkins user access to any valid shell (bash
or sh). Additionally, use yum or apt-get to install git so that Jenkins can issue git commands. The git
installation via yum is not shown for brevity.
[root@ip-10-125-0-85 plugins]# grep jenkins /etc/passwd
jenkins:x:498:497:Jenkins Automation Server:/var/lib/jenkins:/bin/bash
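As an alternative to hand-editing /etc/passwd, the same shell change can be made with usermod, and Git installed in a single command; a quick sketch:
usermod -s /bin/bash jenkins
yum install -y git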
Once Git is installed and Jenkins has shell access, copy the repository URL in SSH format from Github and
substitute it as the repository argument in a test clone; this mirrors the git commands that Jenkins runs
behind the scenes when it checks out the repository. The URL can be copied from Github's clone dialog as
shown below; be sure to select SSH to get the correct repository link.
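A representative manual test of that clone, run from a shell that can present the Jenkins private key, might look like this; the repository URL is the one used in this demonstration.
git clone [email protected]:nickrusso42518/jenkins-demo.git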
At this point, adding a new Jenkins project should succeed when the repository link is supplied. This is an
option under SCM for the project whereby the only choices are git and None. If it fails, an error message will
be prominently displayed on the screen and the error is normally related to SSH setup. Do not specify any
credentials for this because the SSH public key method is inherent with the setup earlier. The screenshot
that follows depicts this process.
As a final check, you can view the Console Output for this project/build by clicking the icon on the left.
It reveals the git commands executed by Jenkins behind the scenes to perform the pull.
The project workspace shows the files in the repository, which include the newly created Ansible playbook.
This section briefly explores configuring Jenkins integration with AWS EC2. There are many more detailed
guides on the Internet which describe this process; this book includes the author’s personal journey into
setting it up. Just like with Git, the AWS EC2 plugins must be installed. Look for the AWS EC2 plugin as
shown in the diagram that follows, and install it. The Jenkins wiki concisely describes how this integration
works and what problems it can solve:
Allow Jenkins to start slaves on EC2 or Eucalyptus on demand, and kill them as they get unused. With this
plugin, if Jenkins notices that your build cluster is overloaded, it’ll start instances using the EC2 API and
automatically connect them as Jenkins slaves. When the load goes down, excessive EC2 instances will be
terminated. This set up allows you to maintain a small in-house cluster, then spill the spiky build/test loads
into EC2 or another EC2 compatible cloud.
Log into the AWS console and navigate to the Identity and Access Management (IAM) service. Create a new
user with full EC2 access, which effectively grants Jenkins API access to EC2. The user will come
with an access key ID and secret access key; copy both pieces of information, as Jenkins must know both.
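For readers who prefer the AWS CLI to the console, roughly equivalent steps are sketched below; the user name and the AmazonEC2FullAccess managed policy are assumptions that match the description above.
aws iam create-user --user-name jenkins
aws iam attach-user-policy --user-name jenkins \
    --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess
aws iam create-access-key --user-name jenkins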
Next, create a new credential of type AWS credential. Populate the fields as shown below.
Navigate back to Manage Jenkins > Configure System > Add a new cloud. Choose Amazon EC2 and
populate the credentials option with the recently created AWS credentials using the secret access key for
the IAM user jenkins. You must select a specific AWS region. Additionally, you’ll need to paste the EC2
private key used for any EC2 instances managed by Jenkins. This is not for general AWS API access but
for shell access to EC2 instances in order to control them. For security, you can create a new key pair within
AWS (recommended but not shown) for Jenkins-based hosts in case the general-purpose EC2 private key
is stolen.
You can validate the connection using the Test Connection button which should result in success.
The final step is determining what kind of AMIs Jenkins should create within AWS. There can be multiple
AMIs for different operating systems, including Windows, depending on the kind of testing that needs to be
done. Perhaps it is useful to run the tests on different operating systems as part of a more comprehensive testing strategy
for software portability. There are many options to enter and the menu is somewhat similar to launching
native instances within EC2. A subset of options is shown here; note that you can validate the spelling of
the AMI codes (accessible from the AWS EC2 console) using the Check AMI button. More details on this
process can be found in the references.
With both Github and AWS EC2 integration set up, a developer can create large projects complete with
automated testing from the SCM repository and automatic scaling within the public cloud. For a larger, more
complex project that requires slave processing nodes, EC2 instances would be created dynamically based on
demand or on administrator-assigned labels within a project.
Jenkins is not the only commonly used CI/CD tool. Gitlab, a private (on-premises) alternative to Github,
supports source code management (SCM) and CI/CD together. A real-life example of this implementation
is provided in the references. All of these options come at a very low price and allow individuals to deploy
higher quality code more rapidly, which is a core tenet of Agile software development. The author has
participated in a number of free podcasts on CI/CD and has used a variety of different providers. These
podcasts are linked in the references.
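As a flavor of how Gitlab combines SCM with CI/CD, pipelines are defined in a .gitlab-ci.yml file at the repository root. The sketch below simply syntax-checks the earlier Ansible playbook; the runner image, stage, and job names are assumptions.
---
# .gitlab-ci.yml
stages:
  - test

syntax_check:
  stage: test
  image: python:3
  script:
    - pip install ansible
    - ansible-playbook --syntax-check sample-pb.yml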
Acronym Definition/Meaning
6LoWPAN IPv6 over Low Power WPANs
ACI Application Centric Infrastructure
AFV Application Function Virtualization
AMI Amazon Machine Image (AWS)
API Application Programming Interface
APIC Application Policy Infrastructure Controller (ACI)
ARN Amazon Resource Name
ASA Adaptive Security Appliance (virtual)
AWS Amazon Web Services
AZ Availability Zone
BGP Border Gateway Protocol
BR Border Router
CAPEX Capital Expenditures
CCB Configuration Control Board
CI/CD Continuous Integration/Continuous Delivery (or Deployment)
CH Cluster Head (see LEACH, TEEN, etc.)
CM Configuration Management
COAP Constrained Application Protocol
COTS Commercial Off The Shelf
CSP Cloud Service Provider
CSPF Constrained Shortest Path First (see MPLS, TE)
CUC Cisco Unity Connection
DC Data Center
DCN Data Center Network
DCOM Distributed Component Object Model (Microsoft)
DEEC Distributed Energy Efficient Clustering
DDEEC Developed Distributed Energy Efficient Clustering
DHCP Dynamic Host Configuration Protocol
DMVPN Dynamic Multipoint VPN
DNA Digital Network Architecture
DNA-C Digital Network Architecture Center
DNS Domain Name System
DTD Document Type Definition (see HTML)
DTLS Datagram TLS (UDP)
DVS Distributed Virtual Switch