A Higher-Order Structure Tensor: Thomas Schultz, Joachim Weickert, and Hans-Peter Seidel
MPI-I-2007-4-005 July 2007
Authors' Addresses
Thomas Schultz
Max-Planck-Institut für Informatik
Stuhlsatzenhausweg 85
66123 Saarbrücken
Germany
Joachim Weickert
Mathematical Image Analysis Group
Faculty of Mathematics and Computer Science
Saarland University, Building E2 4
66041 Saarbrücken
Germany
Hans-Peter Seidel
Max-Planck-Institut für Informatik
Stuhlsatzenhausweg 85
66123 Saarbrücken
Germany
Acknowledgements
We would like to thank Holger Theisel, who is with Bielefeld University, for
discussions at all stages of this project. Discussions with Torsten Langer, who
is with the MPI Informatik, helped in developing parts of the mathematical
toolbox in Chapter 4.
Our implementation uses the CImg library by David Tschumperlé, available
from http://cimg.sf.net/.
This research has partially been funded by the Max Planck Center for
Visual Computing and Communication (MPC-VCC).
Abstract
Structure tensors are a common tool for orientation estimation in image
processing and computer vision. We present a generalization of the traditional
second-order model to a higher-order structure tensor (HOST), which is able
to model more than one significant orientation, as found in corners, junctions,
and multi-channel images. We provide a theoretical analysis and a number
of mathematical tools that facilitate practical use of the HOST, visualize it
using a novel glyph for higher-order tensors, and demonstrate how it can be
applied in an improved integrated edge, corner, and junction detector.
Keywords
Structure Tensor, Higher-Order Tensors, Corner Detection, Multivalued Im-
ages
Contents
1 Introduction
2 A Higher-Order Structure Tensor
3 Glyphs for Higher-Order Tensors
3.1 Generalized Ellipses as Higher-Order Tensor Glyphs
3.2 Experimental Results
4 A Mathematical Toolbox
4.1 Efficient Representation
4.2 Relation to Truncated Fourier Series
4.3 Generalized Tensor Trace
4.4 Generalized Eigenvector Decomposition
4.5 Extrema of the Contrast Function
4.5.1 Accelerating the Brute Force Method
4.5.2 A Faster Method
4.5.3 Implementation of the Faster Method
5 Integrated Edge and Junction Detection
6 Conclusions and Future Work
1 Introduction
The second-order structure tensor, formed as the outer product of the image
gradient with itself, is a common tool for local orientation estimation. Since it
was first introduced for edge and corner detection [9], it has been applied to a
wide variety of problems in image processing and computer vision, including
optic flow estimation [2], image diffusion [23], texture segmentation [21],
image inpainting [22], and image compression [10].
Two popular extensions are its generalization to vector- and tensor-valued
images, which goes back to an idea of Di Zenzo [7], and the introduction of
nonlinear local averaging [24], which led to nonlinear structure tensors [4].
It is a known limitation of the traditional structure tensor that it can
only represent a single dominant orientation. Recently, there have been
attempts to overcome this: Arseneau and Cooperstock [1] have placed
second-order structure tensors in discrete directional bins and derived parameters
of multimodal directional distribution functions from them. Their work
concentrates on lifting the constraint of antipodal symmetry (i.e., they treat
direction $v$ differently than direction $-v$), a property which our approach
preserves. Moreover, they use the structure tensors only as an intermediate
representation, finally reducing them to two scalar parameters for each
direction.
Herberthson et al. [12] have used outer products to handle pairs of
orientations. However, their approach is specific to the case of two orientations:
It neither generalizes to more than two directions, nor does it indicate cases
in which representing a single orientation is sufficient.
In this work, we present a generalization of the second-order structure
tensor to a higher-order tensor model, which is able to capture the
orientations of more complex neighborhoods, for example corners, junctions,
and multivalued images. The tensor order specifies the maximum
complexity the structure tensor can represent and can be chosen based on
the requirements of a given application.
This report is structured as follows: Chapter 2 introduces our new
higher-order structure tensor (HOST). In Chapter 3, we present a novel glyph for
higher-order tensors and use it to visualize first experimental results. A
theoretical analysis and a number of mathematical tools that help to use
the HOST in practice are presented in Chapter 4. They include an efficient
representation of the structure tensor, an alternative representation as
a truncated Fourier Series, a generalization of the matrix trace and the
eigenvector decomposition, and an algorithm to extract contrast extrema from a
higher-order tensor representation. Chapter 5 shows a proof-of-concept
application, in which the HOST is used for improved integrated edge and
junction detection. Finally, Chapter 6 concludes this report and points out
directions of future research.
2 A Higher-Order Structure Tensor
The standard second-order structure tensor $\mathbf{J}$ is given by the outer product
of the image gradient $\nabla f$ with itself [9]:
\[
\mathbf{J} := \nabla f \, \nabla f^{T} \tag{2.1}
\]
It is typically averaged over a neighborhood to obtain a descriptor of local
image structure. Our generalization to a higher-order tensor $\mathcal{J}$ simply repeats
the outer product. For a vector $v$, taking the outer product with itself $l$ times
will be written $v^{\otimes l}$. It yields an order-$l$ tensor, indexed by $\{i_1, i_2, \ldots, i_l\}$:
\[
\left( v^{\otimes l} \right)_{i_1 i_2 \ldots i_l} := v_{i_1} v_{i_2} \cdots v_{i_l} \tag{2.2}
\]
To ensure antipodal symmetry of the resulting tensor, we choose $l$ to be even.
Interpretation of the higher-order structure tensor requires a contrast
function $J$, which specifies the local contrast in a given direction. When
the direction is represented by a unit-length vector $u$, $J(u)$ is defined by
repeating the inner tensor-vector product of $\mathcal{J}$ and $u$ until a scalar is left,
i.e., $l$ times. In $n$ dimensions, this can be written as
\[
J(u) := \sum_{i_1=1}^{n} \sum_{i_2=1}^{n} \cdots \sum_{i_l=1}^{n}
(\mathcal{J})_{i_1 i_2 \ldots i_l} \, u_{i_1} u_{i_2} \cdots u_{i_l} \tag{2.3}
\]
This definition is inspired by a work of Özarslan and Mareci [18], who have
derived a diffusivity function $D(u)$ in the same manner from higher-order
diffusion tensors $\mathcal{D}$ in the context of generalized diffusion tensor magnetic
resonance imaging (DT-MRI). For a second-order structure tensor, $J$ is
unimodal, which reflects the fact that it is suitable to model only one dominant
orientation. For higher orders, $J$ can become multimodal, which allows a
more accurate representation of corners, junctions, and multivalued images.
We consider it a sensible requirement that the values of the contrast
function should remain comparable, independent of the tensor order that
we use. When evaluated in direction of the gradient, the contrast function
yields the squared gradient magnitude in the second-order case. However,
taking the outer product $l$ times would raise the gradient magnitude to the
$l$th power. We compensate for this by scaling the gradient vector beforehand.
An order-$l$ structure tensor $\mathcal{J}$ that reduces to the well-known second-order
tensor $\mathbf{J}$ for $l = 2$ is then given by
\[
\mathcal{J} := \left( \frac{\nabla f}{|\nabla f|^{\frac{l-2}{l}}} \right)^{\otimes l} \tag{2.4}
\]
In some applications, it is beneficial to have a contrast function that gives
the non-squared gradient magnitude [3]. This can be achieved by replacing
the exponent $\frac{l-2}{l}$ by $\frac{l-1}{l}$ in Equation (2.4).
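The construction of this chapter can be sketched in a few lines of Python. The sketch below is only an illustration under stated assumptions: the function names `host_from_gradient` and `contrast` are hypothetical, the tensor is stored redundantly as a dictionary over all index tuples, and no neighborhood averaging is performed. It builds the scaled outer power of Equation (2.4) and evaluates the contrast function of Equation (2.3) by full contraction.

```python
import itertools
import math

def host_from_gradient(grad, order):
    """Order-`order` structure tensor of a single gradient, following Equation (2.4).

    The tensor is stored redundantly: a dict maps zero-based index tuples
    (i_1, ..., i_l) to channel values.
    """
    if order % 2 != 0:
        raise ValueError("order must be even to keep antipodal symmetry")
    indices = list(itertools.product(range(len(grad)), repeat=order))
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return {idx: 0.0 for idx in indices}
    # Scale the gradient so that the l-fold outer product has magnitude |grad|^2.
    scale = norm ** ((order - 2) / order)
    v = [g / scale for g in grad]
    return {idx: math.prod(v[i] for i in idx) for idx in indices}

def contrast(tensor, u):
    """Contrast function J(u) of Equation (2.3): contract every index with u."""
    return sum(value * math.prod(u[i] for i in idx)
               for idx, value in tensor.items())

grad = [3.0, 4.0]        # gradient with magnitude 5
along = [0.6, 0.8]       # unit vector in gradient direction
across = [-0.8, 0.6]     # unit vector perpendicular to the gradient
for order in (2, 4, 6):
    tensor = host_from_gradient(grad, order)
    print(order, contrast(tensor, along), contrast(tensor, across))
```

Up to rounding, the contrast along the gradient stays at the squared gradient magnitude for every order, which is the purpose of the scaling in Equation (2.4).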
3 Glyphs for Higher-Order Tensors
In the literature on higher-order diffusion tensors, generalized Reynolds glyphs
constitute the only glyph-based visualization technique [18, 15]. Let $S$ be the
unit sphere and $J$ the contrast function as defined above. Then, these glyphs
are formed by the set of points $\{J(u)\,u \mid u \in S\}$, which directly depicts the
contrast profile of the tensor. However, these glyphs have a round shape
around their maxima, which makes their exact orientation difficult to see.
To compensate for this problem, Hlawitschka and Scheuermann [15] suggest
adding arrows that point to the maxima.
3.1 Generalized Ellipses as Higher-Order Tensor Glyphs
While the diffusion ellipsoid is accepted as the standard glyph for
second-order diffusion tensors, the Reynolds glyph does not reduce to it for $l = 2$.
Since the tensor ellipsoid can be constructed by transforming the unit sphere
under the linear mapping induced by the tensor, it is natural to generalize it
by taking the inner tensor-vector product $l - 1$ times, until a vector is left.
We denote the inner product $\mathcal{J} \cdot u$, where
\[
(\mathcal{J} \cdot u)_{i_1 i_2 \ldots i_{l-1}} := \sum_{i_l=1}^{n} (\mathcal{J})_{i_1 i_2 \ldots i_l} \, u_{i_l} \tag{3.1}
\]
and use the shortcut notation $\mathcal{J} \cdot^{l-1} u$ to indicate that we repeat it $l - 1$ times.
Then, the surface of our glyph is given by the points $\{\mathcal{J} \cdot^{l-1} u \mid u \in S\}$.
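As an illustration of the glyph construction, the following Python sketch samples the outline of a generalized ellipse in 2D by applying the inner tensor-vector product $l - 1$ times. The function names and the dictionary-based tensor storage are assumptions made for this example, not part of the report's implementation.

```python
import itertools
import math

def outer_power(v, order):
    """Redundant order-`order` tensor of a vector v, indexed by zero-based tuples."""
    return {idx: math.prod(v[i] for i in idx)
            for idx in itertools.product(range(len(v)), repeat=order)}

def contract_once(tensor, u):
    """Inner tensor-vector product of Equation (3.1): sums over the last index."""
    result = {}
    for idx, value in tensor.items():
        head = idx[:-1]
        result[head] = result.get(head, 0.0) + value * u[idx[-1]]
    return result

def glyph_point(tensor, order, u):
    """Glyph surface point obtained by contracting with the unit vector u l-1 times."""
    t = tensor
    for _ in range(order - 1):
        t = contract_once(t, u)
    return [t[(k,)] for k in range(len(u))]

# Order-4 tensor holding two orthogonal orientations of equal strength.
order = 4
a = outer_power([1.0, 0.0], order)
b = outer_power([0.0, 1.0], order)
tensor = {idx: a[idx] + b[idx] for idx in a}

outline = []
for step in range(64):
    phi = 2.0 * math.pi * step / 64
    outline.append(glyph_point(tensor, order, [math.cos(phi), math.sin(phi)]))
```

For this particular tensor, the sampled points trace the curve $(\cos^3\phi, \sin^3\phi)$, which has cusps in the two axis directions where the contrast function has its maxima.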
As the 2D examples in Figure 3.1 illustrate, the extrema of the generalized
ellipses coincide with the extrema of the Reynolds glyphs. However, they
develop sharp features around the maxima, at the cost of a smoother shape
around the minima. In examples two and three, the generalized ellipses
immediately make clear that the respective tensors are not axially symmetric,
a fact which the Reynolds glyph may not reveal at first glance. Since we are
generally more interested in the maxima than in the minima of the contrast
function, we will use the new glyphs in the remainder of this chapter.
Figure 3.1: Three tensors of order six, visualized with Reynolds glyphs (a)
and our generalized ellipses (b). In (b), maxima of the contrast function
appear more localized.
3.2 Experimental Results
We will now present some experiments to confirm that higher-order
structure tensors indeed give a more accurate representation of junctions and
multivalued images. Our first experiment uses simple junctions in synthetic
grayscale images. Derivatives are calculated by convolution with a
derivative-of-Gaussian filter ($\sigma = 0.7$). After HOSTs of different order $l$ have been
computed, their information is propagated to a local neighborhood by
convolution with a Gaussian kernel ($\rho = 1.4$).
Figure 3.2 shows the test images, with the position of the displayed
structure tensor marked by a cross. The results show that a HOST of order $l = 4$
is sufficient to represent two edges that cross orthogonally, while the
traditional structure tensor ($l = 2$) does not distinguish any particular direction.
In the non-orthogonal case, the traditional model indicates a principal
direction which does not correspond to any gradient found in the image. While
the generalized ellipse of order four gives an impression of the involved
directions, a clear separation of the maxima in the contrast profile now requires
higher orders. However, Chapter 4.4 will show that the generalized
eigenvectors of the HOST give a good approximation of the gradient directions
already with $l = 4$.
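The orthogonal case can be reproduced with a small calculation. The following Python sketch is an idealized stand-in for the experiment: it assumes two unit gradients along the coordinate axes instead of derivative filters and Gaussian smoothing, and the function name `contrast_of_sum` is hypothetical. It sums the single-gradient tensors and evaluates the contrast in direction $\phi$.

```python
import math

def contrast_of_sum(gradients, order, phi):
    """Contrast J(phi) of the sum of single-gradient HOSTs in 2D.

    For one gradient g, Equations (2.3) and (2.4) combine to
    (g . u)^l / |g|^(l-2), so no explicit tensor has to be stored here.
    """
    total = 0.0
    for gx, gy in gradients:
        norm = math.hypot(gx, gy)
        if norm == 0.0:
            continue
        dot = gx * math.cos(phi) + gy * math.sin(phi)
        total += dot ** order / norm ** (order - 2)
    return total

# Two edges that cross orthogonally, idealized as two unit gradients.
junction = [(1.0, 0.0), (0.0, 1.0)]
for order in (2, 4):
    profile = [contrast_of_sum(junction, order, math.radians(deg))
               for deg in (0, 45, 90)]
    print(order, profile)
```

For $l = 2$ the contrast is $\cos^2\phi + \sin^2\phi = 1$ in every direction, so no orientation stands out. For $l = 4$ it is $\cos^4\phi + \sin^4\phi$, which drops to one half at 45 degrees and therefore separates the two edge directions.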
The second experiment is based on a natural color image. Derivatives are
now calculated channel-wise and, according to the conventional generalization
to multi-channel images, the HOSTs of the red, green, and blue color channels
are added. In this case, we do not propagate the structure information
($\rho = 0$).
Figure 3.2: Two junctions in grayscale images and the corresponding
structure tensors. For orders $l > 2$, the directions of the meeting edges can be
represented. (a) Orthogonal edges are clearly distinguished with order $l = 4$.
(b) For non-orthogonal edges, higher orders give more accurate
representations.
Figure 3.3: In a color image, the channel-wise gradients may point into
different directions. Higher-order structure tensors can be used to model
this situation accurately.
For comparison, Figure 3.3 also shows the gradients of the individual
color channels. Again, the structure tensor of order four already gives a much
better impression of the dominant directions than the traditional model. To
demonstrate the feasibility of going to very high tensor orders, we also present
the representation with l = 50.
4 A Mathematical Toolbox
4.1 Efficient Representation
An order-$l$ tensor in $n$ dimensions has $n^l$ tensor channels, which becomes
impractical already for moderate $l$. However, higher-order structure tensors
are totally symmetric, i.e., invariant under permutation of their indices. This
reduces the number of independent channels to $N = \binom{n+l-1}{l}$, which means
merely linear growth for $n = 2$ ($N = l + 1$) and quadratic growth for $n = 3$.
With some additional notation, it is possible to evaluate $J(u)$ directly
from this non-redundant representation: Call the $i$th non-redundant element
$[\mathcal{J}]_i$, stored in a zero-based linear array $[\mathcal{J}]$. Let $\nu_{i,k} \in \{0, 1, \ldots, l\}$ denote
the number of times $k \in \{1, 2, \ldots, n\}$ appears as an index of the $i$-th element.
The multiplicity of element $i$, denoted $\mu_i$, is the number of times it appears
as a channel of the original tensor. For $n = 2$, $\mu_i = \binom{l}{\nu_{i,1}}$, for $n = 3$,
$\mu_i = \binom{l}{\nu_{i,1}} \binom{l - \nu_{i,1}}{\nu_{i,2}}$. Then, Equation (2.3) can be rewritten as
\[
J(u) = \sum_{i=0}^{N-1} \mu_i \, [\mathcal{J}]_i \, u_1^{\nu_{i,1}} u_2^{\nu_{i,2}} \cdots u_n^{\nu_{i,n}} \tag{4.1}
\]
For $n = 2$, we chose indices such that $\nu_{i,1} = l - i$ (e.g., $[\mathcal{J}_{1111}, \mathcal{J}_{1112}, \ldots]$).
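For $n = 2$, the non-redundant representation is short enough to sketch directly. The Python example below is a minimal illustration, assuming hypothetical function names `host_2d` and `contrast_2d`; it stores the $l + 1$ channels in the index order described above (element $i$ has $l - i$ indices equal to 1) and evaluates Equation (4.1) with binomial multiplicities.

```python
import math

def host_2d(grad, order):
    """Non-redundant channels of a single-gradient HOST in 2D.

    Channel i multiplies the first component (order - i) times and the
    second component i times, matching the index order chosen for n = 2.
    """
    gx, gy = grad
    norm = math.hypot(gx, gy)
    if norm == 0.0:
        return [0.0] * (order + 1)
    scale = norm ** ((order - 2) / order)   # scaling of Equation (2.4)
    vx, vy = gx / scale, gy / scale
    return [vx ** (order - i) * vy ** i for i in range(order + 1)]

def contrast_2d(channels, u):
    """Equation (4.1) for n = 2: the multiplicity of channel i is binom(l, i)."""
    order = len(channels) - 1
    ux, uy = u
    return sum(math.comb(order, i) * c * ux ** (order - i) * uy ** i
               for i, c in enumerate(channels))

# Tensors of several gradients are combined by adding them channel by channel.
t1 = host_2d((3.0, 4.0), 6)
t2 = host_2d((0.0, 2.0), 6)
combined = [p + q for p, q in zip(t1, t2)]
print(len(combined), contrast_2d(combined, (0.0, 1.0)))
```

An order-6 tensor needs only 7 stored values here, compared with $2^6 = 64$ channels in the redundant form.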
4.2 Relation to Truncated Fourier Series
From generalized DT-MRI, it is known that using a higher-order tensor model
in 3D is equivalent to approximating the diffusivity profile with a truncated
Laplace series [18]. We will now show that the corresponding result in 2D is
a relation of higher-order tensors to truncated Fourier Series. This fact will
serve as the basis of the methods in Chapters 4.3 and 4.5.
Consider a Fourier Series, truncated after order $l$:
\[
f(\phi) = \frac{1}{2} a_0 + \sum_{k=1}^{l} a_k \cos(k\phi) + \sum_{k=1}^{l} b_k \sin(k\phi) \tag{4.2}
\]
Setting $a_k := b_k := 0$ for odd $k$ leaves an $(l + 1)$-dimensional vector space of
functions. For $n = 2$, Equation (4.1) can be rewritten in polar coordinates:
\[
J(\phi) = \sum_{i=0}^{l} [\mathcal{J}]_i \binom{l}{i} \cos^{l-i}\phi \, \sin^{i}\phi \tag{4.3}
\]
Let us regard $[\mathcal{J}]_i$ as coefficients and $\binom{l}{i} \cos^{l-i}\phi \, \sin^{i}\phi$ as basis functions. We
will now show that these basis functions span the same space as the truncated
Fourier Series.
Proof by induction on order $l$. Let $\{f_k\}$ denote the basis functions of a
truncated Fourier Series in which only even multiples of $\phi$ are allowed:
\[
f_k := \begin{cases}
0.5 & \text{if } k = 0 \\
\cos((k + 1)\phi) & \text{if } k \text{ odd} \\
\sin(k\phi) & \text{if } k \text{ even } (k \neq 0)
\end{cases}
\]
Likewise, $t^l_k$ is the $k$-th basis function of an order-$l$ tensor:
\[
t^l_k := \binom{l}{k} \cos^{l-k}\phi \, \sin^{k}\phi
\]
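For $l = 0$, both the Fourier Series and the tensor basis represent constant
functions and $f_0 = 0.5\, t^0_0$. Assume that the functions that can be represented
using $\{f_k\}$ with $k \le l$ are equivalent to the functions represented by $\{t^l_k\}$.
Further, assume that we know how to express the Fourier basis in terms of
the tensor basis. Then, we can show that the same assumption also holds for
$l + 2$: Observe that
\[
\cos^{l-i}\phi \, \sin^{i}\phi
= \left( \cos^2\phi + \sin^2\phi \right) \cos^{l-i}\phi \, \sin^{i}\phi
= \cos^{l+2-i}\phi \, \sin^{i}\phi + \cos^{l-i}\phi \, \sin^{i+2}\phi
\]
Hence each $t^l_i$ is a linear combination of $t^{l+2}_i$ and $t^{l+2}_{i+2}$, so every $f_k$ with
$k \le l$ can also be expressed in the tensor basis of order $l + 2$.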
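\[
\begin{array}{ll}
l = 2 & a_0 = [\mathcal{J}]_0 + [\mathcal{J}]_2 \\
      & a_2 = \frac{1}{2} [\mathcal{J}]_0 - \frac{1}{2} [\mathcal{J}]_2 \\
      & b_2 = [\mathcal{J}]_1 \\
l = 4 & a_0 = \frac{3}{4} [\mathcal{J}]_0 + \frac{3}{2} [\mathcal{J}]_2 + \frac{3}{4} [\mathcal{J}]_4 \\
      & a_2 = \frac{1}{2} [\mathcal{J}]_0 - \frac{1}{2} [\mathcal{J}]_4 \\
      & b_2 = [\mathcal{J}]_1 + [\mathcal{J}]_3 \\
      & a_4 = \frac{1}{8} [\mathcal{J}]_0 - \frac{3}{4} [\mathcal{J}]_2 + \frac{1}{8} [\mathcal{J}]_4 \\
      & b_4 = \frac{1}{2} [\mathcal{J}]_1 - \frac{1}{2} [\mathcal{J}]_3
\end{array}
\]
Table 4.1: Relation of Fourier coefficients and tensor components for orders
$l = 2$ and $l = 4$. A method to compute these relations for general $l$ is given
in the text.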
It remains to be shown how to express $f_{l+1}$ and $f_{l+2}$ in terms of $\{t^{l+2}_k\}$.
For this, we use trigonometric identities for multiple angles:
\begin{align*}
f_{l+1} &= \cos((l + 2)\phi)
 = \sum_{i=0}^{l/2+1} (-1)^i \binom{l+2}{2i} \cos^{l+2-2i}\phi \, \sin^{2i}\phi
 = \sum_{i=0}^{l/2+1} (-1)^i \, t^{l+2}_{2i} \\
f_{l+2} &= \sin((l + 2)\phi)
 = \sum_{i=0}^{l/2} (-1)^i \binom{l+2}{2i+1} \cos^{l+1-2i}\phi \, \sin^{2i+1}\phi
 = \sum_{i=0}^{l/2} (-1)^i \, t^{l+2}_{2i+1}
\end{align*}
Our proof is constructive in the sense that it implies a recursive method
to construct a change-of-basis matrix. For reference, Table 4.1 presents the
relations for l = 2 and l = 4.
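The relations in Table 4.1 can be checked numerically. The Python sketch below does not implement the recursive change-of-basis construction implied by the proof. Instead, as a swapped-in technique, it samples the polar form of the contrast function (Equation (4.3)) at equidistant angles and projects onto the Fourier basis, which is exact up to rounding for a trigonometric polynomial of this degree. The function names are hypothetical.

```python
import math

def contrast_polar(channels, phi):
    """Polar form of the contrast function for n = 2 (Equation (4.3))."""
    order = len(channels) - 1
    c, s = math.cos(phi), math.sin(phi)
    return sum(math.comb(order, i) * ch * c ** (order - i) * s ** i
               for i, ch in enumerate(channels))

def fourier_coefficients(channels):
    """Coefficients a_k, b_k (k = 0..l) of the series in Equation (4.2).

    Uses discrete projection on m = 4l + 4 equidistant samples; a[0] follows
    the convention that the constant term of the series is a_0 / 2.
    """
    order = len(channels) - 1
    m = 4 * order + 4
    angles = [2.0 * math.pi * j / m for j in range(m)]
    values = [contrast_polar(channels, phi) for phi in angles]
    a = [2.0 / m * sum(v * math.cos(k * phi) for v, phi in zip(values, angles))
         for k in range(order + 1)]
    b = [2.0 / m * sum(v * math.sin(k * phi) for v, phi in zip(values, angles))
         for k in range(order + 1)]
    return a, b

# Arbitrary order-4 channels [J]_0 .. [J]_4, chosen only for illustration.
a, b = fourier_coefficients([1.0, 2.0, 3.0, 4.0, 5.0])
print([round(x, 9) for x in a])
print([round(x, 9) for x in b])
```

For these channel values, the $l = 4$ rows of Table 4.1 predict $a_0 = 9$, $a_2 = -2$, $b_2 = 6$, $a_4 = -1.5$ and $b_4 = -1$, with all odd coefficients vanishing.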
4.3 Generalized Tensor Trace
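The second-order tensor trace has been used as a substitute for the squared
gradient magnitude [8]. For the higher-order case, Özarslan et al. [19] have
proposed a generalized trace operation gentr in 3D, which is based on
integrating $J$ over the unit hemisphere $\Omega$ and reduces to the standard matrix
trace for $l = 2$:
\[
\operatorname{gentr}(\mathcal{J}) := \frac{3}{2\pi} \int_{\Omega} J(u) \, du \tag{4.4}
\]
In the 2D case, $\Omega$ is one half of the unit circle and the normalization factor
$\frac{3}{2\pi}$ is to be replaced with $\frac{2}{\pi}$. In terms of the non-redundant tensor
components, the generalized trace in 2D is
\[
\operatorname{gentr}(\mathcal{J}) = 2 \sum_{i=0}^{l/2} [\mathcal{J}]_{2i} \, \frac{(l - 1)!!}{(l - 2i)!! \, (2i)!!} \tag{4.5}
\]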
where l!! is the double factorial, i.e., the product of integers in steps of two.
In the definition of $\mathcal{J}$, we scaled the gradient magnitude such that the
maximum value of $J$ is invariant to the tensor order. However, maxima
become narrower with increasing $l$, so the generalized trace decreases. It
follows from Equation (4.5) that the generalized trace of an order-$l$ structure
tensor equals
\[
\operatorname{gentr}(\mathcal{J}) = 2 \, \frac{(l - 1)!!}{l!!} \, |\nabla f|^2 \tag{4.6}
\]
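A short numerical check of the 2D generalized trace may help here. The Python sketch below assumes the 2D formula gentr $= 2 \sum_{i=0}^{l/2} [\mathcal{J}]_{2i} \, (l-1)!! / ((l-2i)!!\,(2i)!!)$ on the non-redundant channels and compares it with $2\,(l-1)!!/l!!\,|\nabla f|^2$ for single-gradient tensors. All function names are hypothetical.

```python
import math

def double_factorial(k):
    """k!!, the product k (k - 2) (k - 4) ..., with 0!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def host_2d(grad, order):
    """Non-redundant 2D channels of a single-gradient HOST (Equation (2.4))."""
    gx, gy = grad
    norm = math.hypot(gx, gy)
    if norm == 0.0:
        return [0.0] * (order + 1)
    scale = norm ** ((order - 2) / order)
    vx, vy = gx / scale, gy / scale
    return [vx ** (order - i) * vy ** i for i in range(order + 1)]

def gentr_2d(channels):
    """Generalized trace in 2D, computed from the even-numbered channels."""
    order = len(channels) - 1
    return 2.0 * sum(
        channels[2 * i] * double_factorial(order - 1)
        / (double_factorial(order - 2 * i) * double_factorial(2 * i))
        for i in range(order // 2 + 1))

grad = (3.0, 4.0)
for order in (2, 4, 6):
    predicted = 2.0 * double_factorial(order - 1) / double_factorial(order) * 25.0
    print(order, gentr_2d(host_2d(grad, order)), predicted)
```

For $l = 2$ the value equals the ordinary matrix trace, i.e. the squared gradient magnitude. For larger $l$ it shrinks, in line with the narrowing maxima of the contrast function.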
4.4 Generalized Eigenvector Decomposition
Many applications of the second-order structure tensor depend on its spectral
decomposition into eigenvectors and eigenvalues (e.g., [23, 16, 22, 8, 10]).
In this section, we introduce the Cand (Canonical Decomposition), which
can be regarded as a generalized eigendecomposition for higher-order tensors
and has first been studied by Hitchcock [13, 14]. A review in the context of
higher-order statistics and some new results are given by Comon et al. [6, 5].
We will concentrate on the symmetric Cand (sCand), which decomposes
a symmetric order-$l$ tensor $\mathcal{J}$ into a sum of $r$ outer powers $v_i^{\otimes l}$ of unit vectors
$v_i$, $i \in \{1, 2, \ldots, r\}$, scaled with $\lambda_i$:
\[
\mathcal{J} = \sum_{i=1}^{r} \lambda_i \, v_i^{\otimes l} \tag{4.7}
\]
For $l = 2$, Equation (4.7) reduces to the spectral decomposition, where $\lambda_i$ are
the eigenvalues and $v_i$ the eigenvectors. It can be shown that any symmetric
higher-order tensor $\mathcal{J}$ has a sCand [5]. In analogy to the matrix rank, the
symmetric rank $R_S$ of $\mathcal{J}$ is defined as the smallest number $r$ for which a
sCand exists. In dimension $n = 2$, it holds that $R_S \le l$ [6].
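To make the form of Equation (4.7) concrete, the following Python sketch rebuilds the non-redundant 2D channels from a list of pairs $(\lambda_i, v_i)$ and shows how a term with a non-unit vector can be brought into the canonical form with unit vectors by moving the factor $|v_i|^l$ into $\lambda_i$. It does not compute a decomposition. The function names and the example values are assumptions for illustration only.

```python
import math

def to_canonical(lam, v, order):
    """Rescale lam * v^(outer power l) so that the vector has unit length."""
    norm = math.hypot(v[0], v[1])
    return lam * norm ** order, (v[0] / norm, v[1] / norm)

def reconstruct_2d(terms, order):
    """Channels [J]_i = sum_r lam_r * x_r^(l-i) * y_r^i of the sum in Equation (4.7)."""
    return [sum(lam * x ** (order - i) * y ** i for lam, (x, y) in terms)
            for i in range(order + 1)]

order = 4
raw_terms = [(0.5, (2.0, 0.0)), (0.25, (1.0, 1.0))]     # vectors not normalized
canonical_terms = [to_canonical(lam, v, order) for lam, v in raw_terms]

print(canonical_terms)
print(reconstruct_2d(raw_terms, order))
print(reconstruct_2d(canonical_terms, order))
```

Both term lists describe the same tensor, so the two reconstructions agree up to rounding.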
The Cand is a current research topic and while some theoretical results
have been obtained, practical algorithms for efficient computation are rare.
Fortunately, Comon et al. [6] present an algorithm that works for $n = 2$ and
thus can be applied to our HOSTs. It is outside the scope of this chapter to
review the full theory required to derive the algorithm. For our experiments,
we have simply re-implemented the Matlab code given in [6] in C, using
routines from Lapack and the Numerical Recipes [20].
Figure 4.1: Generalized eigenvectors can be used to recover individual
directions from a higher-order structure tensor.
The algorithm returns pairs of $\lambda_i$ and $v_i$, where the $v_i$ are not normalized.
While it appears trivial to convert this result to the canonical form, the
algorithm proves numerically unstable for vectors $v_i$ which are nearly aligned
with the y-axis: In such cases, $\lambda_i$ tends to zero, while the magnitude of $v_i$
tends to infinity. We work around this problem by reconstructing a tensor J
J := J J
can be rotated by 90
$J'(\phi) = 0$. The obvious
method to find such points for an arbitrary differentiable function $J$ is to
sample its derivative with some resolution $r$ to identify intervals in which
it changes sign and to refine the result to a desired accuracy $a$ by a binary
search for the sign change on these intervals.
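The sampling-and-bisection scheme can be sketched as follows. This Python example is a minimal illustration, not the report's C implementation: the function name `find_extrema`, the use of degrees for the resolution and the accuracy, and the default values are assumptions. The derivative is passed in as a callable that takes an angle in radians.

```python
import math

def find_extrema(derivative, resolution_deg=2.0, accuracy_deg=2.0 ** -7):
    """Angles in [0, 360) degrees at which `derivative` changes sign.

    The derivative is sampled at multiples of `resolution_deg`; every sign
    change is refined by bisection until the bracket is at most
    `accuracy_deg` wide. Both defaults are placeholder values.
    """
    steps = int(round(360.0 / resolution_deg))
    extrema = []
    for j in range(steps):
        lo = j * resolution_deg
        hi = lo + resolution_deg
        f_lo = derivative(math.radians(lo))
        f_hi = derivative(math.radians(hi))
        if f_lo == 0.0:
            extrema.append(lo)          # sample lies exactly on a root
            continue
        if f_lo * f_hi >= 0.0:
            continue                    # no sign change inside this interval
        while hi - lo > accuracy_deg:
            mid = 0.5 * (lo + hi)
            f_mid = derivative(math.radians(mid))
            if f_lo * f_mid <= 0.0:
                hi = mid
            else:
                lo, f_lo = mid, f_mid
        extrema.append(0.5 * (lo + hi))
    return extrema

# Derivative of 3/4 + 1/4 cos(4 (phi - phi0)), an order-4 contrast profile
# with maxima every 90 degrees, shifted by phi0 = 10.3 degrees.
phi0 = math.radians(10.3)
found = find_extrema(lambda phi: -math.sin(4.0 * (phi - phi0)))
print(found)
```

The result mixes maxima and minima. Classifying them requires one more look at the sign of the derivative on either side, or at the contrast values themselves.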
This method may miss pairs of extrema whose distance is less than the
sampling resolution $r$. Fortunately, such pairs are usually only minor local
variations in the contrast function, which are of no practical interest (cf.
Figure 4.2), so we found $r = 2^\circ$
\[
\sum_{k=1}^{l} a_k \cos(k\phi) = u_1 \cos\phi - u_2
\quad \text{and} \quad
\sum_{k=1}^{l} b_k \sin(k\phi) = u_1 \sin\phi \tag{4.9}
\]
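The two closing identities can be exercised with a short Python sketch. The recurrence that produces $u_1$ and $u_2$ is not reproduced in this excerpt, so the example assumes the standard Clenshaw recurrence $u_k = c_k + 2\cos\phi \, u_{k+1} - u_{k+2}$ with $u_{l+1} = u_{l+2} = 0$, run once on the cosine coefficients and once on the sine coefficients. The function names are hypothetical.

```python
import math

def clenshaw(coeffs, phi):
    """Return (u_1, u_2) of u_k = c_k + 2 cos(phi) u_{k+1} - u_{k+2}.

    coeffs[k - 1] holds c_k for k = 1..l; the recurrence runs from k = l
    down to k = 1, starting with u_{l+1} = u_{l+2} = 0.
    """
    two_cos = 2.0 * math.cos(phi)
    u1 = u2 = 0.0
    for c in reversed(coeffs):
        u1, u2 = c + two_cos * u1 - u2, u1
    return u1, u2

def fourier_series(a0, a, b, phi):
    """Evaluate a_0/2 + sum a_k cos(k phi) + sum b_k sin(k phi) via Clenshaw."""
    ua1, ua2 = clenshaw(a, phi)   # recurrence on the cosine coefficients
    ub1, _ = clenshaw(b, phi)     # separate recurrence on the sine coefficients
    return 0.5 * a0 + (ua1 * math.cos(phi) - ua2) + ub1 * math.sin(phi)

def fourier_series_direct(a0, a, b, phi):
    """Reference evaluation with one cos and one sin call per term."""
    return 0.5 * a0 + sum(
        ak * math.cos(k * phi) + bk * math.sin(k * phi)
        for k, (ak, bk) in enumerate(zip(a, b), start=1))

a0, a, b = 9.0, [0.0, -2.0, 0.0, -1.5], [0.0, 6.0, 0.0, -1.0]
for deg in (0.0, 17.0, 90.0, 203.5):
    phi = math.radians(deg)
    print(deg, fourier_series(a0, a, b, phi), fourier_series_direct(a0, a, b, phi))
```

The recurrence needs only one cosine and one sine evaluation per angle, which is the reason for using it when the series is sampled at many angles.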
Gentleman [11] has shown that this method magnifies roundoff errors
when evaluated near $\phi = k\pi$ ($k \in \mathbb{Z}$) and can produce unusable results for
large orders $l$. Newbery [17] suggests adaptively performing a phase shift
of $\pi/2$ to avoid such cases. However, an experimental comparison of direct
evaluation, the original and the modified version of Clenshaw's algorithm
indicates that the error is tolerable for the moderate values of $l$ that occur in
our context: For 160 000 structure tensors of order six from a natural color
image, all methods produced identical results (with $r = 2^\circ$ and $a = (2^{-7})^\circ$, at
single precision). Even with $l = 50$, all methods gave the same extrema, now
with a maximum angular deviation of $(2^{-7})^\circ$
Figure 4.2: In a Fourier Series, a small $\epsilon$ (here, $\epsilon = 10^{-10}$) can make the
difference between a pair of extrema, a saddle, and no stationary point. The
top row shows one full period, while the bottom row gives a close-up of the
affected extrema. Panels: (a) $\cos\phi + (\frac{1}{2} + \epsilon) \sin 2\phi$, (b) $\cos\phi + \frac{1}{2} \sin 2\phi$,
(c) $\cos\phi + (\frac{1}{2} - \epsilon) \sin 2\phi$.
() is outside
this corridor, we can be sure that J
()
lies within the corridor; this makes it possible to identify extrema which are
so close that finding them with the brute force approach would be
computationally infeasible. For example, finding the pair of extrema which is
shown in Figure 4.2(a) would require sampling at r (2
10
)
, a = (2
7
)
); for r = 5
,
it vanished.
4.5.3 Implementation of the Faster Method
Since we cannot expect a polynomial of degree three to reasonably
approximate a sine of frequency $l$ in an interval larger than $\pi/l$, we initially partition
$[0, 2\pi)$ into $2l$ equal intervals for a Fourier Series of order $l$. For each of these
intervals, the Taylor expansion is performed by evaluating the higher
derivatives at its center $\phi_0$. The required Fourier coefficients are pre-computed
once. For a third-order approximation, the error bound is
\[
\epsilon = \frac{J^{(5)}(\xi)}{4!} \, (\phi - \phi_0)^4 \tag{4.10}
\]
for some $\xi$ within the interval. Taking the order-five derivative in this
expression is appropriate, since we approximate $J'$
k=1
_
a
2
k
+ b
2
k
(4.11)
and half the interval length is taken for $(\phi - \phi_0)$.
Now, the roots of
J
and
J
at the left
interval boundary, we can go through the sorted error bound intersections to
determine the intervals in which
J