An Adaptive Honeypot System To Capture Ipv6 Address Scans: (Yamaguchi, Yamaki, Takakura) @itc - Nagoya-U.Ac - JP
Email: [email protected]
‡ Information Technology Center
Abstract: The vastness of the IPv6 address space and the rapid spread of its deployment encourage the use of IPv6 networks. Various types of devices, including embedded systems, are ready to use IPv6 addresses, and some of them have already been connected directly to the Internet. This situation entices attackers to change their strategies and choose embedded systems as their targets. We have to deploy various types of honeypots on IPv6 networks to trace attackers' activities and infer their objectives. The huge address space and the wide variety of devices, however, expose the limitations of conventional honeypots. In this paper, we propose a system that dynamically assigns an address to a honeypot by detecting an access to an unassigned address. We also present our strategy against IPv6 address scans, in which honeypots collaborate with each other.

Index Terms: honeypot, IPv6, security

I. INTRODUCTION

In recent years, serious incidents such as targeted attacks have been reported frequently. Such attacks tend to use zero-day attacks, which exploit undisclosed vulnerabilities. Traditional security countermeasures depend heavily on a detection pattern, i.e., a virus definition or an attack signature. Since a sign of a zero-day attack can hardly be caught, the detection pattern cannot be obtained beforehand, and current countermeasures do not work effectively against these attacks. Furthermore, incidents that occur inside a LAN, e.g., a home network or an office network, are increasing. In that situation, the effectiveness of firewalls and IPSs (Intrusion Prevention Systems) is limited. To defend networks from these attacks, many techniques have been studied. Building and deploying honeypots is one widely known approach. A honeypot gathers information on cyber attacks by accepting attacks.

Stocks of IPv4 address blocks managed by IANA ran out on February 3, 2011 [1], which causes serious problems for the growth of the Internet. This fact and events like World IPv6 Launch, which started on June 6, 2012, accelerate the adoption of IPv6. A honeypot system that gathers information effectively in IPv6 is needed.

In general, an address space of 64 bits, that is, 1.8 × 10^19 addresses, is available in one network segment in IPv6. Because of the exhaustion of IPv4 addresses, many devices cannot be assigned a global IP address and must be deployed behind NAT, which also acts as a substitute for a firewall. On the other hand, the vastness of IPv6 allows various types of devices to connect to the Internet directly.

Stateless address autoconfiguration (SLAAC) is conventionally used when a device chooses its IPv6 address. With SLAAC, most devices generate IPv6 addresses from their MAC addresses. A MAC address contains information that identifies the vendor and the model of the device. As a result, an IPv6 address derived from a MAC address depends on the vendor and the model of the device.

In recent years, embedded devices, e.g., home appliances, building facilities and industrial control systems, have been connected to the Internet. Many of them are starting to adopt IPv6 [2][3][4], and the number of such devices will keep increasing. The majority of devices connected to IPv6 networks are expected to be such devices, not PCs.

Because of this change in the network environment, attackers focus not only on PCs but also on embedded devices. Stuxnet was a typical example of an attack on SCADA (Supervisory Control And Data Acquisition) systems. After Stuxnet, several variants of it and new malware targeting SCADA have been reported. Some conventional propagation methods cannot be used when attackers want to discover embedded devices. For example, spam mail cannot be used against sensors because they are not designed to receive mail. One method to discover such devices is a scan. Usually, a scan of an IPv6 network is considered infeasible because of its vast address space. If an attacker targets a specific device that is used in a certain organization, however, he can scan only the address space derived from SLAAC to identify the device. In the easiest scenario, he has to search an address space of 2^32 addresses (footnote 1).

On the other hand, in order to gather attack information, it is required to know which address will be attacked and deploy a honeypot to the address. It is difficult to predict in

Footnote 1: 2^32 = 2^16 (subnet ID of prefix) × 2^16 (lower 16 bits of interface ID).
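The address-space figures quoted in the introduction can be checked with a few lines of arithmetic. The sketch below is only an illustration: the probe rate `PROBES_PER_SECOND` is a hypothetical value chosen for the example and is not a figure from this paper.

```python
# Address-space arithmetic: one IPv6 segment versus a SLAAC-guided scan.
# PROBES_PER_SECOND is a hypothetical rate used only for illustration.

INTERFACE_ID_BITS = 64    # interface identifier length in one segment
SUBNET_ID_BITS = 16       # subnet ID of the prefix (footnote 1)
SERIAL_ID_BITS = 16       # lower 16 bits of the interface ID (footnote 1)
PROBES_PER_SECOND = 1000  # assumption, not a measured value

full_segment = 2 ** INTERFACE_ID_BITS
guided_scan = 2 ** (SUBNET_ID_BITS + SERIAL_ID_BITS)


def scan_days(addresses: int, rate: int = PROBES_PER_SECOND) -> float:
    """Days needed to probe `addresses` addresses at `rate` probes per second."""
    return addresses / rate / 86400


print(f"one segment : {full_segment:.2e} addresses")
print(f"guided scan : {guided_scan} addresses")
print(f"guided scan : {scan_days(guided_scan):.1f} days at {PROBES_PER_SECOND} probes/s")
```

Under this assumed rate, the guided scan is a matter of weeks, while probing a whole segment remains out of reach; a different rate only changes the scale, not the contrast.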
Fig. 1. Structure of an IPv6 address derived from EUI-64 format

Similar to attackers, defenders have to take the situation into account when deploying honeypots. If we adopt a strategy similar to the conventional one, where each honeypot is assigned a fixed IP address, the honeypots exist only sparsely in the IPv6 address space of a subnet and hardly detect malicious activities. On the other hand, if we deploy too many honeypots on a subnet, an attacker can easily identify such an unnatural subnet. Therefore, we need a smarter method to deploy honeypots in an IPv6 environment.

In IPv6, an IP address derived from the EUI-64 format [13] is conventionally used. The EUI-64 format is automatically calculated from a MAC address. A MAC address is a unique identifier assigned to a network interface and consists of 48 bits. The first 24 bits represent a vendor code managed by IEEE, and the lower 24 bits are assigned by each vendor. Conventionally, the lower 24 bits are assigned to devices sequentially. As a result, the lower 24 bits of devices of the same model are sequential. If the high-order bits of the lower 24 bits are the same, the devices are probably the same model. In this paper, the first 8 bits of the lower 24 bits are termed a model identifier and the remaining 16 bits are termed a serial identifier. An EUI-64 format identifier is generated by inserting hex FFFE into the middle of the MAC address, as shown in Fig. 1.

Since an IP address derived from EUI-64 may accept incoming connection requests from other devices [14], most OS implementations innocently respond to connection requests to such addresses. This situation helps an attacker to identify his target devices. Instead of searching the entire address space of a subnet, he just looks through the limited range of IP addresses that are derived from EUI-64. If an attacker targets a specific device, he can obtain detailed knowledge of its behavior. If a honeypot cannot respond to him properly, he suspects the existence of the honeypot and suspends the interaction with it.

D. Security of networked embedded systems and IPv6

Building facilities such as thermometers, air conditioners and lighting equipment are managed through IP networks. Furthermore, a huge number of SCADA systems are ready for IPv6 [2][3][4]. Even in a house, set-top boxes, TV sets and DVRs are equipped with embedded systems and ready to talk IPv4/IPv6 protocols. Power grid research has promoted the network connection of such embedded systems [15]. For example, an electric car is connected to the power line of a house. In case of an emergency, its battery can be used as a power source for the house. With such advancement, the targets of attackers have recently shifted from conventional PCs to embedded systems. As one example, network printers have frequently become victims of cyber attacks. Stuxnet [16] attacked SCADA systems and caused serious damage to the nuclear plant. Several home appliances were compromised and misused as stepping-stones for further attacks.

Furthermore, the wide spread of smartphones and tablets introduces a new trend of device usage, i.e., BYOD (Bring Your Own Device). Since most BYOD devices are ready for IPv6 and are capable of tethering, BYOD causes a new security issue, BYON (Bring Your Own Network). Many organizations, including Nagoya University, have suffered from several issues. For example, misconfigured devices tried to introduce another IPv6 network that is not assigned to the university. As another example, a compromised smartphone attempted to hijack IPv6 sessions.

The reasons for the shift can be considered as follows. First, embedded devices with a network connection have become common and many of them are directly connected to the Internet. Second, their security level is relatively low compared with conventional PCs. To reduce production costs, these devices are equipped with minimum hardware resources. Such a specification cannot afford security functions, e.g., anti-virus software. Most embedded systems are derived from conventional OSs such as Linux, FreeBSD and Windows. Since the life cycle of the devices is very long, e.g., 10 years, and it is not an easy task to update their OSs, their vulnerabilities are frequently left unpatched. An attacker can easily understand their structures. Therefore, these devices become easy targets for attackers.

E. Attacker's strategies to scan IPv6 network

The enlarged address space of IPv6 and SLAAC cause new critical problems that do not exist in IPv4. In IPv6, steps 2 and 3 of Section II-B are more difficult than in IPv4.

Most types of IPS monitor network traffic for a certain period of time, e.g., 1 minute. For each source address, destination IP addresses are sorted in order to find sweep behavior. To evade such a detection mechanism, a smart attacker frequently adopts a slow scan [17]. A slow scan is a technique that sends only a few packets a day, often from different source addresses, so as not to be detected by an IPS.

This paper assumes that an attacker selects one of the three scan strategies below: one ordinary scan and two slow scan strategies.

1) The attacker accesses almost all IPv6 addresses within a specific address range in a short period of time.
In this strategy, the attacker tries to discover as many hosts as possible. Even if the IPv6 addresses are selected randomly, recent IPSs have sophisticated algorithms to detect this activity and can drop such scan packets.
2) Slow scan to search a specific product.
If an attacker who tries to attack a certain organization knows undisclosed vulnerabilities of a specific product, e.g., a digital TV set, he may be interested only in the lowest 16 bits of IPv6 addresses, because he can easily obtain the vendor code and the model identifier of the
product. However, he may not know in advance which network segment his targets are in. Every time he sends a scan packet, he randomly selects one subnet identifier of the organization and one serial identifier of the MAC addresses.
3) Slow scan for a pinpoint attack on a target device.
If an attacker has a clear target person in an organization, he may be eager to know the MAC address of a personal device such as a smartphone. After he succeeds in obtaining the MAC address, he no longer needs to search the interface identifier of the IPv6 address. Only the subnet identifier attracts his interest. Similar to 2), he selects a subnet identifier randomly but fixes the interface identifier.

As a result, it is quite difficult for an IPS to detect such slow, random-walk scans.

Compared with the number of global routing prefixes, the number of allocated prefixes [18] is very small, i.e., 2^48 vs. 2^30. If an attacker wants to find someone's device in the world by using a botnet, the task is therefore not so difficult.

III. PROPOSED METHOD

We propose a system that monitors accesses to unassigned addresses based on EUI-64 and dynamically assigns the accessed address to the appropriate honeypot.

A. Architecture of proposed system

The proposed honeypot system consists of multiple honeypots, three databases and an address assigning manager. Fig. 2 shows the overview.

Fig. 2. honeypot system structure

1) Honeypots: To cope with the problem, we generate various types of honeypots as virtual images and install the images into the system in advance. By using a virtual honeypot and applying a snapshot command, we can easily obtain the compromised disk image of the honeypot. Also, by restoring the original image, the honeypot immediately returns to a clean state. An image of a honeypot is prepared for each combination of OS and applications. Every time an access to an unassigned IPv6 address is detected, our system infers the attacker's target type and selects the most appropriate honeypot among the images.

Currently we deploy conventional OSs, e.g., Windows and Linux, as virtual machines. In addition, we are planning to develop virtual machines that emulate embedded systems such as home appliances and sensors. Since the behavior of embedded systems is simple compared with conventional OSs, it is feasible to develop such a system by applying SGNET-like methods.

2) Databases: Databases store the information that is used in selecting an appropriate honeypot and assigning an IPv6 address to it. We define the following three databases.

• Machine management database
It manages the information of real machines and VMs in the network segment. A real machine is a machine that is not a honeypot. The entry of each machine is manually registered before the proposed honeypot system is started. TABLE I shows the attributes of this database. id represents a machine identifier. machine represents the name of the operating system installed on the machine. mac represents the MAC address of the machine. addr represents an IPv6 address that is statically assigned to the machine.

TABLE I
ATTRIBUTES OF MACHINE MANAGEMENT DATABASE
id        machine identifier
machine   machine name
mac       MAC address of machine
addr      static IPv6 address

• Address management database
It manages the current or past status of each IPv6 address. TABLE II shows the attributes of this database. addr represents an address that was accessed. id represents the identifier of a machine that responded to past accesses. chk is determined from past accesses and is used to decide which honeypot the address is assigned to. chk takes one of five values: success, failure, run, yet and in-use. success indicates that an attack was executed against the machine with id while addr was assigned to it. This means that the attack was not aborted in step 3 of Section II-B and the attacker proceeded to step 4. failure indicates that an attack was aborted. run indicates that addr is currently assigned to the machine with id. yet indicates that addr has not yet been assigned to the machine with id. in-use indicates that addr is assigned to a machine that is not a honeypot.

TABLE II
ATTRIBUTES OF ADDRESS MANAGEMENT DATABASE
addr   IPv6 address
id     machine identifier
chk    state of entry

• MAC address database
This database stores vendor codes, model identifiers and machine identifiers. The vendor codes are published by IEEE. There are 256 entries for each vendor code because a model identifier is an 8-bit number. When our system finds an attack that aims at IPv6 addresses derived from MAC addresses, the database is referred to and its content is updated. TABLE III shows the attributes of this database. vendor represents a vendor code and model represents a model identifier described in Section II-D. id represents the machine identifier of the honeypot.

TABLE III
ATTRIBUTES OF MAC ADDRESS DATABASE
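The lookup in the MAC address database can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the dictionary `MAC_DB`, its single entry and the fallback honeypot `hp-generic` are all hypothetical. The paper describes only the insertion of FFFE; the inversion of the universal/local bit follows the modified EUI-64 rule of RFC 4291 and is added here as an assumption about how SLAAC addresses are formed in practice.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class MacFields(NamedTuple):
    vendor: int  # upper 24 bits: vendor code managed by IEEE
    model: int   # next 8 bits: model identifier (term used in this paper)
    serial: int  # lowest 16 bits: serial identifier


def mac_to_interface_id(mac: str) -> int:
    """Build a 64-bit interface identifier from a 48-bit MAC address."""
    value = int(mac.replace(":", "").replace("-", ""), 16)
    upper, lower = value >> 24, value & 0xFFFFFF
    iid = (upper << 40) | (0xFFFE << 24) | lower  # insert FFFE in the middle
    return iid ^ (1 << 57)  # invert the universal/local bit (RFC 4291)


def interface_id_to_fields(iid: int) -> Optional[MacFields]:
    """Recover vendor/model/serial, or None if FFFE is not in the middle."""
    if (iid >> 24) & 0xFFFF != 0xFFFE:
        return None
    iid ^= 1 << 57  # undo the universal/local bit inversion
    lower = iid & 0xFFFFFF
    return MacFields(iid >> 40, lower >> 16, lower & 0xFFFF)


# Hypothetical MAC address database: (vendor, model) -> honeypot machine id.
MAC_DB: Dict[Tuple[int, int], str] = {(0x001122, 0x33): "hp-tv-01"}


def select_honeypot(iid: int, default: str = "hp-generic") -> str:
    """Pick a honeypot id for an accessed interface identifier.

    Falling back to `default` is an assumption made for this sketch.
    """
    fields = interface_id_to_fields(iid)
    if fields is None:
        return default
    return MAC_DB.get((fields.vendor, fields.model), default)


iid = mac_to_interface_id("00:11:22:33:44:55")
print(hex(iid), interface_id_to_fields(iid), select_honeypot(iid))
```

Keying the table on the (vendor, model) pair mirrors the 256 entries per vendor code mentioned above; the serial identifier is deliberately ignored when choosing a honeypot.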
The address assigning manager makes and sends an NA message that responds to the NS message detected in process 1.

The packet that reached the router before process 1 is sent to the honeypot that is assigned the accessed address after process 5. The honeypot can then begin to communicate with the attacker.

3) Management of addresses used in network segment: The addresses of real machines in the network segment can be reconfigured based on router advertisements. If such an address is not registered in the address management database, it can be mistakenly assigned to a honeypot. To prevent this situation, when a DAD packet is detected, the address assigning manager regards the address as being in use and registers it in the address management database with chk=in-use so that it is not assigned to honeypots. We assume that DHCPv6 is disabled on the network segment in this paper.

4) Collect reconnaissance accesses: Accesses to an unassigned address are regarded as reconnaissance. We judge that the attack has been executed when the count of serial communication exceeds the threshold. Otherwise, we assume that the attack was aborted by the attacker, either because the vulnerability that the attacker targets does not exist or because the attacker noticed that the target is a honeypot. Currently the threshold is a fixed value that is defined beforehand.

When an attack is continued, the attack is classified into one of the following three patterns.

• The honeypot has returned responses that look like those of the target device.
• The attacker has not yet comprehended the response of the target device.
• The attacker is not searching for specific target devices.

Various information can be obtained by analyzing the attack, whichever pattern it falls into. For example, there is a possibility that malware can be collected in the first and the second patterns. The malware is assumed to target devices with the same vendor code and the same model identifier. Unknown vulnerabilities of the device and countermeasures against the attack can be obtained by analyzing the malware.

If the communication continues and the number of its packets exceeds the threshold, the address assigning manager updates chk of the corresponding entry of the address management database to success and the disk image of the honeypot is stored for later analysis. The honeypot is then rebooted from a clean disk image. A clean disk image is a copy of the disk image of a honeypot before it is attacked. Otherwise, chk is updated to failure and the IPv6 address is released from the honeypot. When chk of all entries related to the accessed address has become failure, they are changed back to yet.

C. Strategies of address assignment

The proposed method changes the strategy of address assignment according to the interface identifier of the accessed IPv6 address and the type of scan described in Section II-E.

1) Grouping of an IPv6 address: We classify IPv6 addresses into three groups, human friendly, EUI-64 derived and random/unknown, according to the interface identifier.

• Human friendly addresses
According to [19], network administrators tend to select the interface identifier of an IPv6 address with one of the following patterns.
– "low-byte" addresses
All bytes of the interface identifier except the lowest one are set to 0 (as in 2001:db8::3).
– IPv4-based addresses
The interface identifier encodes the IPv4 address of the network interface (as in 2001:db8::192:168:1:1).
– Wordy addresses
The interface identifier is formed from encode words (as in 2001:db8::dead:beef). We regard "ace", "add", "bad", "bee", "beef", "bed", "bead", "cab", "cafe", "dad", "dead", "face", "fade", "fee" and "feed" as encode words.

The first and second patterns above contain many 0s in the interface identifier. If half or more of the hexadecimal digits of an interface identifier are 0, or if the interface identifier is formed from encode words, the proposed method classifies the IPv6 address as a human friendly address.

• Addresses derived from EUI-64
IPv6 addresses with an interface identifier derived from the EUI-64 format are classified into this group. The vendor and the model of the device are identified from the interface identifier as described in Section II-C.

• Random/unknown addresses
IPv6 addresses that belong to neither of the above two groups are classified into this group. It is assumed that the interface identifier is generated by Privacy Extensions [14], random values or unknown algorithms. They are out of our scope because it is difficult to find a relation between packets with such destination addresses.

2) Strategy for the accesses to human friendly addresses: Compared with the other groups, a human friendly address tends to be assigned to a special-purpose device, e.g., a server or a router. Therefore, when the proposed method detects an NS packet to an address of this group, it assigns the address to a honeypot. If our honeypot responded to every scan packet targeting a /64 network, this behavior would be unnatural and an attacker might suspect the existence of the honeypot. To avoid such a situation, our method assigns at most X IPv6 addresses. X represents the maximum number of assignable IPv6 addresses for each segment and is determined based on the environment of each segment.

3) Strategy for slow scan to addresses derived from EUI-64: As described in Section II-E, most attackers adopt the slow scan. To detect such scans, our honeypot systems should be deployed in various network segments and should collaborate with each other. When one honeypot system observes an access to an unassigned IPv6 address, the system sends a message to all systems. By examining the messages from honeypot systems in other network segments, each honeypot system may find that the destination addresses have the same vendor code and the same model identifier. In this case, our system regards the activity as a slow scan.
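The address grouping and the cross-segment slow-scan check described above can be sketched together. This is a hedged illustration rather than the authors' code: all names are hypothetical, the test for "wordy" identifiers (every non-zero 16-bit group spells one of the listed words) and the order in which the groups are tested are assumptions, and the `min_segments` threshold is not specified in the paper.

```python
import ipaddress
from collections import defaultdict
from typing import Iterable, Set, Tuple

ENCODE_WORDS = {"ace", "add", "bad", "bee", "beef", "bed", "bead", "cab",
                "cafe", "dad", "dead", "face", "fade", "fee", "feed"}


def interface_id(addr: str) -> int:
    """Lower 64 bits of an IPv6 address."""
    return int(ipaddress.IPv6Address(addr)) & ((1 << 64) - 1)


def classify(addr: str) -> str:
    """Group an address as human-friendly, eui64 or random/unknown."""
    iid = interface_id(addr)
    digits = f"{iid:016x}"
    groups = [f"{(iid >> shift) & 0xFFFF:x}" for shift in (48, 32, 16, 0)]
    nonzero = [g for g in groups if g != "0"]
    # Assumption: "wordy" means every non-zero group is an encode word.
    wordy = bool(nonzero) and all(g in ENCODE_WORDS for g in nonzero)
    if digits.count("0") >= 8 or wordy:  # half or more of 16 hex digits are 0
        return "human-friendly"
    if digits[6:10] == "fffe":           # FFFE in the middle of the identifier
        return "eui64"
    return "random/unknown"


def vendor_model(addr: str) -> Tuple[int, int]:
    """Vendor code and model identifier of an EUI-64 derived address."""
    iid = interface_id(addr) ^ (1 << 57)  # undo universal/local bit inversion
    return iid >> 40, (iid >> 16) & 0xFF


def find_slow_scans(observations: Iterable[Tuple[str, str]],
                    min_segments: int = 2) -> Set[Tuple[int, int]]:
    """Return (vendor, model) pairs probed in at least `min_segments` segments.

    `observations` holds (segment name, accessed destination address) pairs
    as they might be exchanged between honeypot systems.
    """
    seen = defaultdict(set)
    for segment, addr in observations:
        if classify(addr) == "eui64":
            seen[vendor_model(addr)].add(segment)
    return {key for key, segs in seen.items() if len(segs) >= min_segments}


reports = [("seg-a", "2001:db8:1::211:22ff:fe33:4455"),
           ("seg-b", "2001:db8:2::211:22ff:fe33:9abc"),
           ("seg-a", "2001:db8:1::3")]
print(find_slow_scans(reports))
```

Grouping by vendor code and model identifier rather than by full address is what lets probes with different serial identifiers, seen in different segments, be recognised as one scan.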
TABLE VI
RESPONSE TIME OF PING
OS                  average time (s)   std dev (s)   worst time (s)
Ubuntu 10.04        1.04               0.04          1.12
CentOS 6.0          1.24               0.23          2.02
Windows Vista SP2   2.12               0.14          2.41

TABLE VII
TIME TAKEN IN EACH PROCESS
                 Ubuntu 10.04   CentOS 6.0   Windows Vista
process 3 (ms)   9.11 (single value reported)
process 4 (ms)   192            262          1244
process 5 (ms)   3.22 (single value reported)

Even when we took the situation into account, there still existed significant gaps between the values of TABLEs VI and VII. Therefore we investigated the reason and found that there was idle time on the address assigning manager between processes 4 and 5.

After process 4 completed, the manager logged into a honeypot and set up the IPv6 address. Because the time required for the process was measured on the host machine, the measurement ended when the manager issued the ifconfig/ipconfig command. Therefore we performed an additional experiment to measure the time to completion of the command and found that it took 0.9 seconds on average.

Furthermore, we should take into account the timeout of the 3-way handshake that establishes a TCP connection. Its value depends on each OS implementation. From our analysis [22], the timeout was at least 6 seconds. This result means that our system can establish TCP connections without dropping any packet and continue communication with attackers. Therefore, we confirmed the feasibility of the proposed method.

V. CONCLUSIONS

In this paper, we have pointed out the obstacles to deploying IPv6 honeypots and attackers' strategies to scan IPv6 networks. In IPv6, each network segment has a vast address space, and an IPv6 address derived from the EUI-64 format depends on the model of the device. IPv6 attackers can target and discover a specific device by scanning the address space derived from SLAAC.

To deal with the vast address space of IPv6 and the dependence of IPv6 addresses on the device model, we have proposed a honeypot system whose IPv6 address is derived from an attack. Our system identifies an attacked address by detecting an NS packet that is multicast from a router. The address is assigned to a honeypot that emulates the behavior of an appropriate device before the communication with attackers begins. To cope with slow scans, we have proposed strategies in which the systems are deployed in various network segments and collaborate with each other.

The experiment showed that the proposed method can complete the address assignment within the timeout of the TCP connection. We confirmed the feasibility of the proposed method.

As of now, only VMs based on conventional OSs are implemented in the system. As future work, we will add to our system VMs that observe the behavior of embedded systems and emulate it. We will also evaluate the effectiveness of the strategies for slow scans in a real network.

REFERENCES

[1] Available Pool of Unallocated IPv4 Internet Addresses Now Completely Emptied, http://www.icann.org/en/news/releases/release-03feb11-en.pdf.
[2] Y. Gelogo, S. Jeon, and T. Kim, "IPv6 mobile sensor network architecture for SCADA system," in Computational Intelligence and Communication Networks (CICN), 2011 International Conference on. IEEE, 2011, pp. 485–488.
[3] S. Suriadi, A. Tickle, E. Ahmed, J. Smith, and H. Morarji, "Risk modelling the transition of SCADA system to IPv6," What Kind of Information Society? Governance, Virtuality, Surveillance, Sustainability, Resilience, pp. 384–395, 2010.
[4] IPv6 Embedded Systems and 6LoWPAN Sensor Networks, http://www.rmv6tf.org/2010-IPv6-Summit-Presentations/RMv6TF-Sensor-Network-Preso - Chuck Sellers.pdf.
[5] RFC 5157 IPv6 Implications for Network Scanning, http://tools.ietf.org/html/rfc5157.
[6] RFC 4861 Neighbor Discovery for IP Version 6 (IPv6), http://tools.ietf.org/html/rfc4861.
[7] L. Spitzner, Definitions and Value of Honeypots, http://www.trackinghackers.com/papers/honeypots.html.
[8] J. Song, H. Takakura, and Y. Okabe, "Cooperation of intelligent honeypots to detect unknown malicious codes," in WOMBAT Workshop on Information Security Threats Data Collection and Sharing. IEEE, 2008, pp. 31–39.
[9] M. Vrable, J. Ma, J. Chen, D. Moore, E. Vandekieft, A. Snoeren, G. Voelker, and S. Savage, "Scalability, fidelity, and containment in the Potemkin virtual honeyfarm," in ACM SIGOPS Operating Systems Review, vol. 39, no. 5. ACM, 2005, pp. 148–162.
[10] SGNET, http://wombat-project.eu/WP3/FP7-ICT-216026-Wombat WP3 D13 V01-Sensor-deployment.pdf.
[11] R. Beverly, Opportunistic IPv6 Insight via Abusive Traffic, http://www.caida.org/workshops/isma/1202/slides/aims1202 rbeverly.pdf.
[12] K. Steding-Jessen, Development of an IPv6 Honeypot, http://www.cert.br/docs/palestras/certbr-ipv6-national-csirts-meeting2009.pdf.
[13] Guidelines for 64-bit Global Identifier (EUI-64) Registration Authority, http://standards.ieee.org/develop/regauth/tut/eui64.pdf.
[14] RFC 4941 Privacy Extensions for Stateless Address Autoconfiguration in IPv6, http://tools.ietf.org/html/rfc4941.
[15] T. Miyamoto, Y. Koyama, K. Sakai, and Y. Okabe, "A GMPLS-based power resource reservation system toward energy-on-demand home networking," in Applications and the Internet (SAINT), 2012 IEEE/IPSJ 12th International Symposium on. IEEE, 2012, pp. 138–147.
[16] Cyberattacks on Iran – Stuxnet and Flame, http://topics.nytimes.com/top/reference/timestopics/subjects/c/computer malware/stuxnet/index.html.
[17] M. Dabbagh, A. Ghandour, K. Fawaz, W. Hajj, and H. Hajj, "Slow port scanning detection," in Information Assurance and Security (IAS), 2011 7th International Conference on. IEEE, 2011, pp. 228–233.
[18] Measurement of IPv6 readiness, http://v6metric.jp/html/st01/12.html.
[19] Network Reconnaissance in IPv6 Networks, http://tools.ietf.org/html/draft-gont-opsec-ipv6-host-scanning-01.
[20] F. Bellard, "QEMU, a fast and portable dynamic translator," in Proceedings of the annual conference on USENIX Annual Technical Conference, ser. ATEC '05. Berkeley, CA, USA: USENIX Association, 2005, pp. 41–46. [Online]. Available: http://dl.acm.org/citation.cfm?id=1247360.1247401
[21] A. Kivity, Y. Kamay, D. Laor, U. Lublin, and A. Liguori, "kvm: the Linux virtual machine monitor," in Proceedings of the Linux Symposium, vol. 1, 2007, pp. 225–230.
[22] N. Kitagawa, H. Takakura, and T. Suzuki, "An anti-spam method via real-time retransmission detection," 18th IEEE International Conference on Networks (to appear), 2012.