Spectral Techniques for Graph Bisection in Genetic Algorithms
Jacob G. Martin
University of Georgia
Computer Science
Athens, GA, 30601, USA

[email protected]
ABSTRACT

Various applications of spectral techniques for enhancing graph bisection in genetic algorithms are investigated. Several enhancements to a genetic algorithm for graph bisection are introduced based on spectral decompositions of adjacency matrices of graphs and subpopulation matrices. First, the spectral decompositions give initial populations for the genetic algorithm to start with. Next, spectral techniques are used to engineer new individuals and reorder the schema to strategically group certain sets of vertices together on the chromosome. The operators and techniques are found to be beneficial when added to a plain genetic algorithm and when used in conjunction with other local optimization techniques for graph bisection. In addition, several world record minimum bisections have been obtained from the methods described in this study.

Categories and Subject Descriptors


I.5.3 [Computing Methodologies]: Pattern Recognition - Clustering [algorithms, similarity measures]

General Terms
Algorithms, Experimentation, Performance

Keywords
Genetic algorithm, singular value decomposition, graph bisection, graph partitioning, spectral bisection, genetic engineering, reduced rank approximation

1. INTRODUCTION

The technique of singular value decomposition (SVD) has proven itself valuable in several different problem domains: data compression [17], image recognition and classification [20], chemical reaction analysis [41], document comparison [14, 7], cryptanalysis [39, 46], and genetic algorithms [37, 36].
Although these domains are quite different in some aspects, each can be reduced to the problem of ascertaining or ranking relevance in data. Intuitively, the concept of relevance depends critically on the nature of the problem at hand. SVD provides a method for mathematically discovering correlations within data. The focus of this work is to investigate several possible methods of using SVD in a genetic algorithm to better solve the minimum graph bisection problem.
SVD is useful when bisecting certain types of graphs. To obtain a bisection of a graph, SVD is performed directly on the 0,1 adjacency matrix of the graph to be bisected. Next, an eigenvector is chosen and its components are partitioned based on the median of all of the components. Given that each component of an eigenvector represents a vertex of the graph, a partitioning of the graph is achieved. The process of using eigenvectors to bisect graphs is called spectral bisection. The technique's roots stem from the works of Fiedler [19], who studied the properties of the second smallest eigenvector of the Laplacian of a graph, and Donath and Hoffman [15], who proved a lower bound on the size of the minimum bisection of the graph.
In addition to applying SVD directly to graphs, it is also used in several ways to guide the search process of a Genetic Algorithm (GA). SVD helps guide the search process of the GA by identifying the most striking similarities between genes in the most highly fit individuals of the optimization history. The GA's mutation operator is then restricted to only modify the loci of the genes corresponding to these striking similarities. In addition, individuals are engineered out of the discovered similarities between genes across highly fit individuals. The genes are also reordered on a chromosome to group similar genes closer together. The heuristics show remarkable performance improvements. In addition, the performance achieved is magnified when the heuristics are combined with each other. As further evidence for the applicability of these new heuristics, several world record minimum bisections have been obtained from the genetic algorithm described in this paper.
The first section gives background information on the graph bisection problem, genetic algorithms, and SVD. The second section discusses the implementation details for the genetic algorithm. Section three describes the spectral heuristics that augment the standard GA. The fourth section gives experimental evidence for the applicability of the operators described. The last two sections provide future research ideas and a summary of the results.

2. BACKGROUND


2.1 Minimum Graph Bisection

2.1.1 Problem Statement

A bisection of a graph G = (V, E) with an even number of vertices is a pair of disjoint subsets V_1, V_2 ⊆ V of equal size. The cost of a bisection is the number of edges (a, b) ∈ E such that a ∈ V_1 and b ∈ V_2. The minimum graph bisection problem takes as input a graph G with an even number of vertices, and returns a bisection of minimum cost.
The minimum graph bisection problem arises in many important scientific problems. Several examples include the splitting of data structures between processors for parallel computation, the placement of circuit elements in engineering design, and the ordering of sparse matrix computations [11]. The minimum graph bisection problem has been shown to be NP-Complete [22], making it a prime candidate for research and study.
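As a concrete illustration of the cost function just defined, the following sketch (Python, with an edge-list representation chosen here for brevity; the paper itself works with 0,1 adjacency matrices in Java) counts the edges crossing a bisection.

def bisection_cost(edges, V1):
    # Count edges (a, b) with exactly one endpoint in V1 (the other is then in V2).
    V1 = set(V1)
    return sum(1 for a, b in edges if (a in V1) != (b in V1))

# Example: the 4-cycle 0-1-2-3-0 bisected into {0, 1} and {2, 3} has cost 2.
print(bisection_cost([(0, 1), (1, 2), (2, 3), (3, 0)], [0, 1]))  # -> 2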

2.1.2 Literature Review

Many heuristics have been developed for this problem. Frieze and McDiarmid provide an analysis of the performance of algorithms on random graphs [21]. Perhaps the best known heuristic is the Kernighan-Lin heuristic [32, 10]. The Kernighan-Lin heuristic has a time complexity of O(n^3) and is P-Complete [45, 27]. Fiduccia and Mattheyses gave a simplification of the Kernighan-Lin heuristic that has time complexity Θ(|E|) [18]. The efficiency is gained by sorting data using a method called the bucket sort. A simulated annealing approach is used by Johnson et al. [29]. Spectral techniques for graph bisection are motivated by the work of Fiedler [19]. Indeed, spectral techniques are often used to enhance graph algorithms [1, 43, 2, 5]. Donath and Hoffman are among the first to suggest using spectral techniques for graph partitioning [15]. Alpert and Yao showed that more eigenvectors may help improve results [3]. Their main result showed that when all eigenvectors are used, the min-cut graph partitioning and max-sum vector partitioning problems have identical objectives. Graph partitioning with genetic algorithms has been studied extensively [35, 12, 33, 48]. Most GA methods incorporate other algorithms and heuristics, such as spectral partitioning or Kernighan-Lin. Singular value decomposition has also proved to be a useful tool when clustering graphs [16, 31]. However, this paper contains one of the first attempts to combine these results, providing strategies for using singular value decomposition in a genetic algorithm for the minimum graph bisection problem.

2.2 Genetic Algorithms

2.2.1 Background and Terminology

Genetic Algorithms (GAs) are search and optimization methods that mimic natural selection and biological evolution to solve optimization and decision problems. The book by David Goldberg [24] provides a thorough introduction to the field of Genetic Algorithms. A brief overview of genetic algorithms and some definitions of terminology follow.
A chromosome is a sequence of gene values. In this paper, each gene will usually have a value of either a zero or one. A potential solution to a problem is represented by a chromosome. For graph problems, the number of vertices is the size of the chromosome. A schema is a pattern of genes consisting of a subset of genes at certain gene positions. If n is the size of a chromosome, a schema is an n-tuple {s_1, s_2, ..., s_n} where for all i, s_i ∈ {0, 1, *}. Positions in the schema that have a * symbol correspond to don't-care positions. The non-* symbols are called specific symbols, and represent the defining values of a schema. The number of specific symbols in a schema is called the order, and the length between the first and last specific symbols in a schema is called the defining length of the schema. The schema theorem implies that the smaller the order of a schema, the more copies it will have in the next generation.
Although genetic algorithms do not specifically work with schemata themselves, schemata are a fundamental concept when analyzing the exploratory process of a genetic algorithm. According to the building block hypothesis [24, 28], GAs implicitly favor low-order, high-quality schemata. Furthermore, as evolution progresses, the GA creates higher-order, high-quality schemata out of low-order schemata. This is partially due to the nature of the crossover operator. The repercussions of this behavior are the impetus for the schema reordering algorithms presented in Section 3.4.2.
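For example, the schema 1**01* has order 3 (three specific symbols) and defining length 4 (its first and last specific symbols sit at positions 1 and 5). The helper below, a small Python sketch (the paper's GA itself is implemented in Java), computes both quantities for a schema written as a string over 0, 1, and *.

def schema_order(schema):
    # Order: the number of specific (non-*) symbols.
    return sum(1 for s in schema if s != '*')

def defining_length(schema):
    # Defining length: distance between the first and last specific symbols.
    fixed = [i for i, s in enumerate(schema) if s != '*']
    return fixed[-1] - fixed[0] if fixed else 0

print(schema_order("1**01*"), defining_length("1**01*"))  # -> 3 4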

2.3 Singular Value Decomposition

Theorem 1. Let A be an m × n real matrix with rank r. Then there exists an m × n diagonal matrix

$$\Sigma = \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix} \qquad (1)$$

where the diagonal entries of D are the first r singular values of A, σ_1 ≥ σ_2 ≥ ... ≥ σ_r > 0, and there exist an m × m orthogonal matrix U and an n × n orthogonal matrix V such that

$$A = U \Sigma V^T \qquad (2)$$

The existence and theory of the SVD were established by several mathematicians: Beltrami, Jordan, Sylvester, and Schmidt [47]. Stewart provides an excellent survey of the history of discoveries that led to the theory of the SVD [49].

2.3.1 Summary

As Theorem 1 states, SVD expresses an m × n matrix A as the product of three matrices, U, Σ, and V^T. The matrix U is an m × m matrix whose first r columns, u_i (1 ≤ i ≤ r), are the orthonormal eigenvectors that span the space corresponding to the row auto-correlation matrix AA^T. The last m − r columns of U form an orthonormal basis for the left nullspace of A. Likewise, V is an n × n matrix whose first r columns, v_i (1 ≤ i ≤ r), are the orthonormal eigenvectors that span the space corresponding to the column auto-correlation matrix A^T A. The last n − r columns of V form an orthonormal basis for the nullspace of A. The middle matrix, Σ, is an m × n diagonal matrix with Σ_ij = 0 for i ≠ j and Σ_ii = σ_i ≥ 0. The σ_i's are called the singular values and are arranged in descending order with σ_1 ≥ σ_2 ≥ ... ≥ σ_n ≥ 0. The singular values are defined as the square roots of the eigenvalues of AA^T and A^T A. The SVD can equivalently be expressed as a sum of rank one matrices


$$A = \sum_{i=1}^{r=\mathrm{rank}(A)} \sigma_i u_i v_i^T \qquad (3)$$


The u_i's and v_i's are the columns of U and V, respectively. Using the Golub-Reinsch algorithm [25, 23], U, Σ, and V can be calculated for an m by n matrix in time O(m^2 n + mn^2 + n^3).

2.3.2 Reduced Rank Approximations

The magnitudes of the singular values indicate the weight of a dimension. To obtain an approximation of A, all but the k < r largest singular values in the decomposition are set to zero. This results in the formation of a new low-dimensional matrix A_k, of rank k, corresponding to the k most influential dimensions.

$$A_k = U_k \Sigma_k V_k^T = \sum_{i=1}^{k} \sigma_i u_i v_i^T \qquad (4)$$

Here, U_k and V_k are the matrices formed by keeping only the eigenvectors in U and V corresponding to the k largest singular values. Eckart and Young's paper is a rediscovery of this property, first proved by Schmidt [47].
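A minimal NumPy sketch of Eq. (4) follows (the paper computes its SVDs with LAPACK routines through MTJ in Java; NumPy is used here only for brevity). The reconstruction error of A_k in the Frobenius norm is governed by the discarded singular values, which is the Eckart-Young-Schmidt property mentioned above.

import numpy as np

def rank_k_approximation(A, k):
    # A_k = U_k Sigma_k V_k^T, keeping only the k largest singular values (Eq. 4).
    U, s, Vt = np.linalg.svd(A, full_matrices=False)  # singular values arrive in descending order
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

A = np.random.rand(6, 4)
A2 = rank_k_approximation(A, 2)
print(np.linalg.matrix_rank(A2))  # 2
print(np.linalg.norm(A - A2))     # equals sqrt(s[2]**2 + s[3]**2) for the discarded singular values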


3. IMPLEMENTATION DETAILS

Individuals are represented in binary in the following manner. If the i-th component of an individual is one, then the i-th vertex is placed in the set V_1. Otherwise, if the i-th component of an individual is zero, then the i-th vertex is put in the set V_2. Notice that individuals are symmetrical in this representation. That is, flipping every bit in one solution gives the exact same bisection. Before the GA starts, the ordering of the vertices is permuted to prevent the results from containing any possible bias on the ordering of the input. Tests are performed using a custom GA, implemented entirely in Java. The SVD is computed using LAPACK routines and the Matrix Toolkits for Java (MTJ).

3.1 The Genetic Algorithm

An approach similar to the (μ + λ) evolution strategy is used, with populations of size 100 generating 100 candidate individuals. Reinsertion is achieved by picking the best 100 individuals out of the 200 total parents and children. The results correspond to the average of the best individual at each generation, over 100 different random initial populations. Let f(x) be the value of the function that is being optimized when applied to an individual x. The log fitness of an individual is defined as

$$\mathrm{logfitness}(x) = \ln\left(\frac{1}{1 + |f(x) - \mathrm{target}|}\right) \qquad (5)$$

In this fitness function, the function value f(x) approaches its target (for example the minimum) as the fitness function approaches zero. Individuals with higher fitness represent better solutions than those with lower fitness. An individual with a fitness equal to zero is an exact solution because only then will f(x) = target.
A pseudocode listing of the genetic algorithm appears in Figure 1. An explanation of each individual function appears in Sections 3.2, 3.3, and 3.4.

spectral_injection;                      // seed the population with spectral bisections (Sec. 3.4.1)
do {
    reorder_schema;                      // group correlated genes on the chromosome (Sec. 3.4.2)
    restrict_space;                      // restrict mutation to an SVD-selected gene subset (Sec. 3.4.3)
    for i from 1 to 100 do
        choose parent1 and parent2 from population;
        child = crossover(parent1, parent2);
        mutate(child);
        modified_kernighan_lin(child);   // local improvement (Sec. 3.2)
        children.add(child);
    end for;
    replace(population, children);       // keep the best 100 of parents and children
    engineered = engineer(population);   // SVD-based genetic engineering (Sec. 3.4.4)
    replace(population, engineered);
} until (stopping condition)

Figure 1: The Hybrid Genetic Algorithm
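Eq. (5) translates directly into code; the sketch below (Python, with illustrative names) makes explicit that an exact solution, f(x) = target, is the only way to reach a log fitness of zero.

import math

def logfitness(fx, target):
    # Eq. (5): 0 when fx == target, increasingly negative as fx moves away from the target.
    return math.log(1.0 / (1.0 + abs(fx - target)))

print(logfitness(100, 100))  # 0.0
print(logfitness(105, 100))  # about -1.79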


3.2 Local Improvements

Hybrid GAs are those that incorporate a local search procedure during each generation on the new offspring. Local searches are almost always problem specific. Their goal is to improve a candidate solution to a problem by exploring locally around the solution's values. Hybrid GAs are a hybridization of a genetic algorithm with a local search heuristic that is tailored specifically for solving a certain problem. Generally, the performance of the local improvement heuristic is compromised to give a lower time complexity when creating a hybrid GA. This ensures that the local improvement heuristic does not overwhelm the overall running time of the GA.
The implemented GA uses a trimmed down variant of the Kernighan-Lin [32] optimization algorithm. The traditional Kernighan-Lin heuristic has a time complexity of O(n^3) and is not guaranteed to provide the minimum bisection. The algorithm's time complexity is trimmed down in the exact way that is described in Bui and Moon's paper on graph partitioning with a GA [12].
Additionally, the data structures and implementation of the algorithm are done in constant time by using the methods of Fiduccia and Mattheyses [18]. Fiduccia and Mattheyses gave a simplification of the Kernighan-Lin heuristic that has time complexity Θ(|E|) [18]. The efficiency is gained by sorting vertex gains using a method called the bucket sort. The addition of the Fiduccia-Mattheyses technique grants the ability to perform a limited, low cost, local search when solving various graph bisection problems.

3.3 Genetic Operators

The mutation rate is set at 12%. A modified mutation method of switching two random genes is implemented to keep the number of ones and zeroes in an individual equal. In the case of subproblem evolution, a gene from the subproblem area is flipped and an opposite gene from the non-subproblem area is also flipped. In plain GAs, the mutation operator simply exchanges the values of two opposite genes.
The crossover operator is duplicated from an earlier paper about graph bisection with genetic algorithms by Bui and Moon [12]. The crossover operator is a modified 5-point crossover that considers the symmetric nature of chromosomes in the graph bisection problem. After mutation and crossover, repair operators are utilized to repair the resulting partitions that are not perfectly balanced. The repair process is also implemented in the same manner as Bui and Moon [12].
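The two mutation variants described under Section 3.3 can be sketched as one routine (Python; the function and parameter names are illustrative, and the interpretation of the 12% rate as a per-call probability is an assumption).

import random

def swap_mutation(chromosome, rate=0.12, subproblem=None):
    # Balanced swap mutation: pick one gene (restricted to `subproblem` when given) and swap it
    # with a gene of the opposite value (taken from outside the subproblem area in that case),
    # so the number of ones and zeroes in the individual stays equal.
    if random.random() >= rate:
        return chromosome
    pool = list(subproblem) if subproblem is not None else list(range(len(chromosome)))
    i = random.choice(pool)
    opposite = [j for j in range(len(chromosome))
                if chromosome[j] != chromosome[i]
                and (subproblem is None or j not in subproblem)]
    if opposite:
        j = random.choice(opposite)
        chromosome[i], chromosome[j] = chromosome[j], chromosome[i]
    return chromosome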


INPUT: Adjacency Matrix A
OUTPUT: Partition List P

1. Compute all of the eigenvectors of the input matrix A.
2. For each eigenvector, compute the median of its components and place vertex i in the first partition if the i-th component of the eigenvector is less than or equal to the median. Otherwise, place vertex i in the second partition.
3. If necessary, repair the partition to make the number of vertices equal by moving vertices from the bigger partition to the smaller partition until the number of nodes in each partition is equal. Start with nodes that are closer to the other partition in terms of their corresponding eigenvector component.
4. Add the resulting partition to the list of all partitions to return, P.

Figure 2: Algorithm for Spectral Bisection
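A compact sketch of Figure 2 in Python/NumPy follows (the paper's implementation computes the decomposition with LAPACK routines through MTJ in Java); the repair step moves the vertices whose eigenvector components lie closest to the other side.

import numpy as np

def spectral_bisections(A):
    # Median-split bisection from every eigenvector of the 0/1 adjacency matrix A (Figure 2).
    n = A.shape[0]
    _, V = np.linalg.eigh(A.astype(float))      # eigenvectors of the symmetric adjacency matrix
    partitions = []
    for k in range(n):
        v = V[:, k]
        side = v <= np.median(v)                # True -> first partition
        while side.sum() > n // 2:              # repair: shrink the bigger partition,
            ones = np.where(side)[0]            # starting with vertices closest to the other side
            side[ones[np.argmax(v[ones])]] = False
        while side.sum() < n // 2:
            zeros = np.where(~side)[0]
            side[zeros[np.argmin(v[zeros])]] = True
        partitions.append(side.astype(int))
    return partitions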

3.4 SVD Incorporation

The goal is to discover the genes that are used similarly across the best individuals. The ideas to be presented next can be generalized to other methods of determining similarly used genes. However, SVD yields accurate identification of subproblems in optimization problems whose solutions have a block representation [42]. The SVD of a matrix containing the best few (5) individuals in the entire optimization history is computed. Instead of aiming for the sole fittest individual, the GA uses SVD to decompose the best few fittest individuals and therefore directs the search towards a combination of the best individuals. The computational complexity of computing the SVD may outweigh the complexity of the problem being solved. However, problems with a computationally expensive fitness function may benefit from the methods to be described. In particular, if complex problems can be decomposed into smaller and simpler subproblems, then the benefit will outweigh the cost of computing the SVD. Several time optimizations can also be made to decrease the amount of time used computing the SVD. For example, existing SVDs can be updated using special algorithms for adding or removing rows and columns [6]. Also, random projections are a fast alternative to SVD [42].

3.4.1 Spectral Injection

The technique of spectral bisection provides initial population seedings for the genetic algorithm. Initially, the SVD of the adjacency matrix of the graph to be bisected is computed. All bisections are created using the algorithm in Figure 2. The best spectrally found bisections are initially injected into the population to influence the GA towards good bisections. Experiments with this method show that spectral injection gives the GA a tremendous head start in comparison to not using it at all. The motivation for using spectral partitioning is that the eigenvalues and eigenvectors of many types of adjacency matrices have been shown to have many relationships to properties of graphs. Moreover, every eigenvalue and eigenvector of a matrix can be computed efficiently in polynomial time. Therefore, eigenvalues and eigenvectors are prime candidates for constructing efficient algorithms for solving various intractable graph problems.
The relationships between the spectrum of a graph (which are the eigenvalues of its adjacency matrix) and the properties of the graph itself have been popular topics for research and discovery in the last fifty years [13]. The spectrum has been used to help solve the problem of graph isomorphism [50]. Certain eigenvectors of adjacency matrices in several representations sometimes tend to partition their corresponding graph into two halves such that the conductance of the parts is high, but the conductance between parts is low. Eigenvectors have been used to find good minimum cut partitions and to find good colorings for graphs [8, 5, 4]. However, most studies usually only focus on one eigenvector of one representation type for the adjacency matrix. This eigenvector is called the Fiedler vector, and corresponds to the second smallest eigenvalue of the Laplacian [19]. In the context of this paper, spectral bisection is performed on every eigenvector of the adjacency matrix of the graph. The best 100 resulting partitions are then used to seed the genetic algorithm's first population.

3.4.2 Schema Reordering

Due to the nature of the problems addressed, good schemata are apt to be destroyed during crossover if the locations forming the schema are scattered apart on the chromosome. To combat the disruptive nature of crossover, chromosomes are reordered to group the similar genes closer together on a chromosome. This helps to create higher-quality schemata with shorter defining lengths. SVD defines the reordering at every generation during optimization. The reordering groups similar genes together, allowing the GA to benefit from the building block hypothesis. This is in contrast to a strategy that only performs an initial schema preprocessing once before the GA for the minimum graph bisection problem starts [12]. It should be noted that this schema reordering technique affects the defining length, but not the order of the schema.
As the building block hypothesis suggests, the computational power of genetic algorithms largely comes from manipulating the solutions of subproblems, i.e., building blocks. Hence, identifying subproblems has been a center of many subfields within genetic and evolutionary computation. Three examples of related fields that should be studied to better connect the use of SVD to current GA research are Linkage Learning [26], Probabilistic Model Building Genetic Algorithms [44], and Learnable Evolution Models [38].
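The paper does not spell out the exact reordering rule, so the sketch below is only one plausible realization (an assumption): gene positions are sorted by their coordinates in the top right singular vectors of the matrix of highly fit individuals, which places similarly used genes next to each other and shortens defining lengths.

import numpy as np

def svd_gene_ordering(P, k=2):
    # P: (individuals x genes) 0/1 matrix of highly fit chromosomes.
    # Assumed rule: sort gene positions by their coordinates in the top-k right singular vectors.
    _, _, Vt = np.linalg.svd(P.astype(float), full_matrices=False)
    keys = Vt[:k, :]                      # one k-dimensional key per gene position
    return np.lexsort(keys[::-1])         # permutation of gene indices (primary key = first vector)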

3.4.3 Restricted Mutation

The mutation operator is restricted to a strategically chosen subset of the genes. This isolates the search process to the genes in highly fit solutions, facilitating the determination of the local optimum. The subset of genes is chosen by using an SVD process described in a previous paper by Martin [36]. The subset of genes is chosen randomly from the set of all sets of highly correlated genes identified by SVD. The restriction is only applied every other generation. This allows the mutation operators to have full access to the entire space of possible chromosomes.


3.4.4 Genetic Engineering

A genetic engineering approach is tested at every generation. First, the rank-2 SVD of 5 to 10 proportionally selected individuals is computed. The number of individuals is chosen uniformly at random. Then, using a process similar to that described in a previous paper [36], a new graph of correlated genes is generated. Specifically, the magnitude of the (i, j) entry in the unit scaled matrix A_2 A_2^T determines if an edge appears between vertex i and vertex j in the new graph. If the entry is bigger than 0.9, an edge is created. The vertices in the new graph represent the original graph's vertices but are instead connected to those vertices that the top 5 to 10 best individuals collectively believe should be clustered into the same side of the bisection. Ironically, a minimum bisection of the new graph gives a good approximation of the combination of the best individual minimum bisections in the original graph. To keep the problem from becoming self-referentially intractable, an approximate minimum bisection of the new graph is discovered by running only one iteration of full Kernighan-Lin on a randomly generated individual. If better, the newly generated individual replaces the worst individual in the current population.
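A sketch of the graph construction step is given below (Python/NumPy; the meaning of "unit scaled" is an assumption here, taken as dividing by the largest magnitude). The chromosomes are placed in the columns of P so that the rank-2 product A_2 A_2^T is an n-by-n matrix indexed by pairs of vertices.

import numpy as np

def engineered_graph(P, k=2, threshold=0.9):
    # P: n-by-m 0/1 matrix whose columns are 5 to 10 proportionally selected chromosomes.
    U, s, Vt = np.linalg.svd(P.astype(float), full_matrices=False)
    Ak = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # rank-k approximation of the subpopulation
    C = Ak @ Ak.T                                # (i, j) measures how similarly genes i and j are used
    C = C / np.abs(C).max()                      # unit scaling (assumed)
    G = (np.abs(C) > threshold).astype(int)      # edge when the scaled magnitude exceeds 0.9
    np.fill_diagonal(G, 0)                       # no self-loops
    return G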

3.4.5 Low Rank Approximations

Two forms of the SVD are tested. The first is the full rank version of the SVD. The second is based on the reduced rank version, where all but the first k largest singular values are set to zero, giving A_k. As expected, the reduced rank strategies generally discover the subproblems more efficiently than the full rank versions. This is due in part to the theoretical results mentioned in the probabilistic analysis of reduced rank spectral clustering in a well-known paper by Papadimitriou et al. [42]. The performance may also have improved because, in the application domains tested, the GA is only seeking one block in the solution space. Reduction to a lower rank correctly directs the search towards the correct block because a lower value of k in A_k increases the cosines of the angles between vectors of similar types [9]. Another reason may be that in comparison with higher rank reductions, lower rank reductions are less restrictive and will identify larger subsets of related genes as the rank is reduced. Therefore, lower rank reductions allow the restrictive mutation and crossover operators to have more freedom during exploration. However, lowering the rank too much may not always increase the performance because all genes will be seen as similar to all other genes.

4. EMPIRICAL RESULTS

Intuitively, the number of generations it takes to find a solution is the greatest factor in proving a genetic algorithm's performance. It is also illuminating to compare the average best individual at every generation. This allows one to discover the convergence properties of a particular configuration of the GA. The results are based on the average of the best or average individual fitness at each generation over 100 independent runs of the GA.
The GA is compared with various combinations of genetic operators, local search functions, and techniques used for solving the minimum graph bisection problem. To assess the amount of benefit achieved using the SVD heuristics, comparisons are made to a plain genetic algorithm that does not use the SVD heuristics. The plain GA serves as a strawman for the SVD methods. Unless otherwise indicated, the plain GA is augmented with local search. In some cases, the spectral injection heuristics discussed earlier are also included in the plain GA.

4.1 Minimum Graph Bisection

4.1.1 Graph Types

Geometric and caterpillar graphs are studied and used as the basis of experiment. A description of the notation and construction details of each type of graph follows. Experiments on other types of graphs (random, grid, path, and highly clustered) give similar results but are not included in this paper due to space restrictions.
1. Random Geometric Graphs U_{n.d}: A graph on n vertices created by associating n vertices with different locations on the unit square. The unit square is located in the first quadrant of the Cartesian plane. Therefore, each vertex's location is represented by a pair (x, y) for some 0 ≤ x, y ≤ 1. An edge is created between two vertices if and only if the Euclidean distance between the two is d or less. These graphs are defined and tested in the simulated annealing study by Johnson et al. [29]. (A construction sketch appears after this list.)



2. Caterpillar Graphs CAT_n: A caterpillar graph on n vertices. Two of the vertices are the head and tail of the caterpillar. Next, (n-2)/7 vertices are chosen to represent the discs in the spine of the caterpillar. To each of these vertices is then attached 6 legs from the remaining (n-2) - (n-2)/7 vertices. The caterpillar graphs considered here have an even number of discs in their spine. This implies that the only possible caterpillars have an even number of vertices with n ∈ {(i · 6 + i) + 2 : i ≥ 2, i mod 2 = 0} = {16, 30, 44, ..., 352, ...}. Here, i represents the total number of discs on the spine. Caterpillar graphs have been shown to be very difficult for standard graph bisection algorithms such as Kernighan-Lin [30, 12]. In addition, the minimum bandwidth problem for caterpillars with hair length 3 is NP-Complete [40].
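The construction sketch referenced in item 1 follows (Python/NumPy; the seed parameter is illustrative, not part of the paper's definition).

import numpy as np

def random_geometric_graph(n, d, seed=None):
    # U_{n.d}: n points placed uniformly in the unit square; edge when Euclidean distance <= d.
    rng = np.random.default_rng(seed)
    pts = rng.random((n, 2))
    diff = pts[:, None, :] - pts[None, :, :]
    A = (np.sqrt((diff ** 2).sum(axis=-1)) <= d).astype(int)
    np.fill_diagonal(A, 0)
    return A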

4.1.2 Discussion

The SVD engineering technique is similar in function to a voting scheme. Evidence for this is provided in Figure 3. The voting technique takes the top 5 to 10 proportionally selected individuals and calculates a vote for which side of the bisection each vertex should belong. Before the vote is counted, every bit in an individual's representation is flipped if and only if the first bit is zero. This helps account for the symmetrical nature of candidate solutions for the graph bisection problem. Next, the vote is taken and a new individual is engineered from the resulting votes. It can be seen from Figure 3 that the cut solution qualities of the generated individuals are similar, but that the SVD engineering performs better. In addition, the average generated cut size increases rapidly in the first 10 generations, and then rapidly decreases. This indicates that the GA needs some time to discover good basis individuals for engineering. Finally, the hypothesis that the SVD engineering technique acts as a shared approximate vote is validated by the similarities between the optimization curves for voting and SVD.
Figure 4 depicts the results from an experiment that compares most of the described heuristics. In addition, local searches are performed at each generation. Spectral injection, subproblem restriction and rotation [36], engineering, and schema reordering are all verified to positively influence the performance of the genetic algorithm separately for this graph. Figure 5 shows that the performance increase is much more dramatic when the local search operator is not performed. However, Figure 6 shows that when the Kernighan-Lin local improvement is used with graphs for which KL does not perform well (caterpillars), the SVD techniques outperform the plain GA by a more significant margin. This indicates that SVD may be a viable alternative to KL and that it can be successfully paired with KL to provide additional performance.
In addition to the previous experiments, several record size minimum bisections for real world graphs are found using the techniques described in this paper. The three graphs for which record bisections are achieved are named data, add20 (a 20 bit adder), and bcsstk33 (a statics module of a pin boss). These results are listed in Chris Walshaw's graph partitioning archive located at http://staffweb.cms.gre.ac.uk/~c.walshaw/partition/ [48].
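For reference, the voting engineer used as the baseline in this comparison can be sketched as follows (Python/NumPy; how ties and imbalance are resolved is not specified in the paper, so the repair step here is an assumption).

import numpy as np

def voting_engineer(individuals):
    # Flip each individual whose first bit is 0 (symmetry canonicalization), then take a
    # per-gene majority vote and repair the result to a balanced bisection.
    P = np.array(individuals)
    flip = P[:, 0] == 0
    P[flip] = 1 - P[flip]
    votes = P.mean(axis=0)
    child = (votes >= 0.5).astype(int)
    n = len(child)
    while child.sum() > n // 2:                      # too many ones: drop the weakest votes
        ones = np.where(child == 1)[0]
        child[ones[np.argmin(votes[ones])]] = 0
    while child.sum() < n // 2:                      # too few ones: promote the strongest votes
        zeros = np.where(child == 0)[0]
        child[zeros[np.argmax(votes[zeros])]] = 1
    return child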



Figure 3: Cut size comparisons between voting and SVD engineering approaches with no local improvements for Bui's U1000.05

Figure 4: Average best fitness per generation when using spectral injection and the modified KL approach on U1500.0797788


5. FUTURE WORK

The positive benefits of adding SVD to KL based algorithms have been explored in this paper. Analysis of variance (ANOVA) tests should be conducted to better prove that the presented methods work well in combination with each other. ANOVA tests should also be used to better isolate the benefits of each operator.

Figure 5: Cut size results corresponding to no spectral injection and no local improvements for Bui's U1000.20


Figure 6: Average population fitness per generation for using a modified KL local improvement on CAT128
A graph bisection technique called Lock-Gain (LG) partitioning was recently introduced by Kim and Moon [34]. LG partitioning extends KL by using a new tie-breaking strategy that intelligently selects the best highest gain vertex to exchange during one pass of KL. In addition, the gain of a vertex is calculated in a manner that takes into account vertices that have already been moved. The various combinations of the SVD operators and techniques described herein should be investigated in conjunction with the lock-gain partitioning method and other metaheuristics for minimum graph bisection.
Additional operators and procedures based on spectral information should be considered. For example, a spectral crossover operator can be used to give a linkage probability to chromosomes that is related to the information provided by the spectral decomposition of the adjacency matrix of the graph to be bisected. This type of operator is justified because many of the eigenvectors of several adjacency matrix representations tend to group vertices together that should be placed in the same partition. The distance between the valuations for the vertices in the eigenvectors can be used to determine the probability that two genes travel together during crossover. Another example is the possibility of using spectral information to enhance tie-breaking strategies in LG and KL. Finally, the possible benefits of starting with a spectral population should be examined in detail.

6. CONCLUSION

This paper presents several methods for enhancing the performance of a genetic algorithm to better solve the minimum graph bisection problem. First, spectral techniques are employed to seed the initial population with good solutions. SVD is also used to engineer new individuals, restrict the locus of mutation, and to define schema reorderings based on approximations of highly fit individuals. The new operators and techniques are investigated with respect to their consequences on performance in conjunction with a hybridized genetic algorithm employing the Kernighan-Lin local search operator and other operators described in previous research papers [12, 36]. All of the introduced techniques are shown to be beneficial to the genetic algorithm. Empirical results obtained from the combination and application of these new heuristics are encouraging.

7. REFERENCES

[1] N. Alon. Spectral techniques in graph algorithms (invited paper). In LATIN '98: Theoretical Informatics (Campinas, 1998), volume 1380 of Lecture Notes in Comput. Sci., pages 206-215. Springer, Berlin, 1998.
[2] C. J. Alpert, A. B. Kahng, and S.-Z. Yao. Spectral partitioning with multiple eigenvectors. Discrete Appl. Math., 90(1-3):3-26, 1999.
[3] C. J. Alpert and S.-Z. Yao. Spectral partitioning: The more eigenvectors, the better. In DAC, pages 195-200, 1995.
[4] B. Aspvall and J. R. Gilbert. Graph coloring using eigenvalue decomposition. SIAM J. Algebraic Discrete Methods, 5(4):526-538, 1984.
[5] E. R. Barnes. An algorithm for partitioning the nodes of a graph. SIAM J. Algebraic Discrete Methods, 3(4):541-550, 1982.
[6] M. W. Berry, Z. Drmac, and E. R. Jessup. Matrices, vector spaces, and information retrieval. SIAM Rev., 41(2):335-362 (electronic), 1999.
[7] M. W. Berry, S. T. Dumais, and G. W. O'Brien. Using linear algebra for intelligent information retrieval. SIAM Rev., 37(4):573-595, 1995.
[8] R. B. Boppana. Eigenvalues and graph bisection: An average-case analysis (extended abstract). In FOCS, pages 280-285, 1987.
[9] M. Brand and K. Huang. A unifying theorem for spectral embedding and clustering. In International Workshop on Artificial Intelligence and Statistics, Jan. 2003.
[10] T. N. Bui, S. Chaudhuri, F. T. Leighton, and M. Sipser. Graph bisection algorithms with good average case behavior. Combinatorica, 7(2):171-191, 1987.
[11] T. N. Bui and C. Jones. A heuristic for reducing fill-in in sparse matrix factorization. In PPSC, pages 445-452, 1993.
[12] T. N. Bui and B. R. Moon. Genetic algorithm and graph partitioning. IEEE Trans. Comput., 45(7):841-855, 1996.
[13] D. M. Cvetkovic, M. Doob, and H. Sachs. Spectra of Graphs, volume 87 of Pure and Applied Mathematics. Academic Press Inc. [Harcourt Brace Jovanovich Publishers], New York, 1980.
[14] S. C. Deerwester, S. T. Dumais, T. K. Landauer, G. W. Furnas, and R. A. Harshman. Indexing by latent semantic analysis. Journal of the American Society of Information Science, 41(6):391-407, 1990.
[15] W. E. Donath and A. J. Hoffman. Lower bounds for the partitioning of graphs. IBM J. Res. Develop.,
17:420-425, 1973.
[16] P. Drineas, A. M. Frieze, R. Kannan, S. Vempala, and V. Vinay. Clustering large graphs via the singular value decomposition. Machine Learning, 56(1-3):9-33, 2004.
[17] R. O. Duda and P. E. Hart. Pattern Classification and Scene Analysis. Wiley, New York, 1972.
[18] C. M. Fiduccia and R. M. Mattheyses. A linear-time heuristic for improving network partitions. In DAC '82: Proceedings of the 19th Conference on Design Automation, pages 175-181, Piscataway, NJ, USA, 1982. IEEE Press.
[19] M. Fiedler. Algebraic connectivity of graphs. Czechoslovak Math. J., 23(98):298-305, 1973.
[20] M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang, B. Dom, M. Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steele, and P. Yanker. Query by image and video content: The QBIC system. IEEE Computer, 28(9):23-32, Sept. 1995.
[21] A. Frieze and C. McDiarmid. Algorithmic theory of random graphs. Random Structures Algorithms, 10(1-2):5-42, 1997.
[22] M. R. Garey, D. S. Johnson, and L. Stockmeyer. Some simplified NP-complete graph problems. Theoret. Comput. Sci., 1(3):237-267, 1976.
[23] G. Golub and C. Reinsch. Handbook for Matrix Computation II, Linear Algebra. Springer-Verlag, 1971.
[24] D. E. Goldberg. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley, Reading, Mass., 1989.
[25] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press, Baltimore, MD, 1996.
[26] G. R. Harik and D. E. Goldberg. Learning linkage. In Foundations of Genetic Algorithms, pages 247-262, 1996.
[27] B. Hendrickson and R. W. Leland. A multi-level algorithm for partitioning graphs. In Supercomputing '95: Proceedings of the 1995 ACM/IEEE Conference on Supercomputing, 1995.
[28] J. H. Holland. Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, Mich., 1975.
[29] D. S. Johnson, C. R. Aragon, L. A. McGeoch, and C. Schevon. Optimization by simulated annealing: an experimental evaluation. Part I, graph partitioning. Oper. Res., 37(6):865-892, 1989.
[30] C. A. Jones. Vertex and Edge Partitions of Graphs. PhD thesis, Pennsylvania State University, University Park, PA, USA, 1992.
[31] R. Kannan, S. Vempala, and A. Vetta. On clusterings: good, bad and spectral. J. ACM, 51(3):497-515 (electronic), 2004.
[32] B. Kernighan and S. Lin. An efficient heuristic procedure for partitioning graphs. Bell Systems Journal, 49:291-307, 1972.
[33] J.-P. Kim and B.-R. Moon. A hybrid genetic search for multi-way graph partitioning based on direct partitioning. In L. S. et al., editor, Proceedings of the
Genetic and Evolutionary Computation Conference (GECCO-2001), pages 408-415, San Francisco, California, USA, 7-11 July 2001. Morgan Kaufmann.
[34] Y.-H. Kim and B. R. Moon. Lock-gain based graph partitioning. J. Heuristics, 10(1):37-57, 2004.
[35] H. S. Maini, K. G. Mehrotra, M. Mohan, and S. Ranka. Genetic algorithms for graph partitioning and incremental graph partitioning. Technical Report CRPC-TR94504, Center for Research on Parallel Computation, Rice University, Houston, TX, 1994.
[36] J. G. Martin. Subproblem optimization by gene correlation with singular value decomposition. In GECCO '05: Proceedings of the 2005 Conference on Genetic and Evolutionary Computation, pages 1507-1514, New York, NY, USA, 2005. ACM Press.
[37] J. G. Martin and K. Rasheed. Using singular value decomposition to improve a genetic algorithm's performance. In Proceedings of the 2003 Congress on Evolutionary Computation (CEC 2003), pages 1612-1617, Canberra, 8-12 Dec. 2003. IEEE Press.
[38] R. S. Michalski. Learnable evolution model: Evolutionary processes guided by machine learning. Mach. Learn., 38(1-2):9-40, 2000.
[39] C. Moler and D. Morrison. Singular value analysis of cryptograms. Amer. Math. Monthly, 90(2):78-87, 1983.
[40] B. Monien. The bandwidth minimization problem for caterpillars with hair length 3 is NP-complete. SIAM J. Algebraic Discrete Methods, 7(4):505-512, 1986.
[41] B. Noble and J. W. Daniel. Applied Linear Algebra. Prentice-Hall, Englewood Cliffs, NJ, USA, third edition, 1988.
[42] C. H. Papadimitriou, P. Raghavan, H. Tamaki, and S. Vempala. Latent semantic indexing: a probabilistic analysis. J. Comput. System Sci., 61(2):217-235, 2000.
[43] A. Pothen, H. D. Simon, and K.-P. Liou. Partitioning sparse matrices with eigenvectors of graphs. SIAM J. Matrix Anal. Appl., 11(3):430-452, 1990.
[44] K. Sastry and D. E. Goldberg. Probabilistic model building and competent genetic programming. In R. L. Riolo and B. Worzel, editors, Genetic Programming Theory and Practice, chapter 13, pages 205-220. Kluwer, 2003.
[45] J. E. Savage and M. G. Wloka. Parallelism in graph-partitioning. J. Parallel Distrib. Comput., 13(3):257-272, 1991.
[46] B. R. Schatz. Automated analysis of cryptogram cipher equipment. Cryptologia, 1(2):116-142, Apr. 1977.
[47] E. Schmidt. Zur Theorie der linearen und nichtlinearen Integralgleichungen. Math. Ann., 63(4):433-476, 1907.
[48] A. J. Soper, C. Walshaw, and M. Cross. A combined evolutionary search and multilevel optimisation approach to graph-partitioning. J. Global Optim., 29(2):225-241, 2004.
[49] G. W. Stewart. On the early history of the singular value decomposition. SIAM Rev., 35(4):551-566, 1993.
[50] E. R. van Dam and W. H. Haemers. Which graphs are determined by their spectrum? Linear Algebra Appl., 373:241-272, 2003.