Hydrodynamical Formulation of Quantum Mechanics, Kähler Structure, and Fisher Information
Marcel Reginatto
Environmental Measurements Laboratory, U. S. Department of Energy
201 Varick St., 5th floor, New York, New York 10014-4811, USA
(arXiv:quant-ph/9909065, 21 September 1999)
Abstract
The Schrödinger equation can be derived using the minimum Fisher information principle. I discuss why such an approach should work, and also show that the Kähler and Hilbert space structures of quantum mechanics result from combining the symplectic structure of the hydrodynamical model with the Fisher information metric of information theory.
I. INTRODUCTION
In a previous paper [1], it was shown that the hydrodynamical formulation of the
Schrödinger equation can be derived using an information-theoretical approach that is based
on the principle of minimum Fisher information. A derivation along similar lines is also
possible for other non-relativistic quantum mechanical equations, such as the Pauli equation
[2] and the equation for the quantum rotator [3]. The purpose of this paper is two-fold:
to examine why such an information-theoretical approach should work, and to show that
the Kähler and Hilbert space structures of quantum mechanics result from combining the
symplectic structure of the hydrodynamical model with the Fisher information metric of
information theory. The connection between Fisher information and the Schrödinger equation was first pointed out by Frieden [4]. Frieden and coworkers later developed
and extended this work in a series of papers which made use of a new principle called the
extreme physical information (EPI) principle. In this paper I will not discuss the EPI
principle, which differs from the principle of minimum Fisher information in many ways (for
a review of the EPI approach, see the book by Frieden [5]), but will concentrate instead
on the information-theoretical approach used in [1]. In this approach, the emphasis is
on using the principle of minimum Fisher information to complement a physical picture
derived from a hydrodynamical model. Applying the principle under the assumption that
one can describe the motion of particles in terms of a hydrodynamical model leads directly
to Madelung’s hydrodynamical formulation of quantum mechanics [6].
II. FISHER INFORMATION

Let P(y^i) be a probability density defined over a space of coordinates y^i, and let P(y^i + ∆y^i) be the density that results from a small change in the y^i. Expand P(y^i + ∆y^i) in a Taylor series, and calculate the cross-entropy J up to the first non-vanishing term,
J(P(y^i + \Delta y^i) : P(y^i)) = \int P(y^i + \Delta y^i) \, \ln \frac{P(y^i + \Delta y^i)}{P(y^i)} \, d^n y    (1)

    \simeq \frac{1}{2} \int \frac{1}{P(y^i)} \frac{\partial P(y^i)}{\partial y^j} \frac{\partial P(y^i)}{\partial y^k} \, d^n y \; \Delta y^j \Delta y^k

    = I_{jk} \, \Delta y^j \Delta y^k.
The I_{jk} are the elements of the Fisher information matrix. This is not the most general expression for the Fisher information matrix, but the particular case that is of interest here. The general expression is

I_{jk}(\theta^i) = \frac{1}{2} \int \frac{1}{P(x^i|\theta^i)} \frac{\partial P(x^i|\theta^i)}{\partial \theta^j} \frac{\partial P(x^i|\theta^i)}{\partial \theta^k} \, d^n x,    (2)

where P(x^i|θ^i) is a probability density that depends on a set of n parameters θ^i in addition to the n coordinates x^i. The expression for the I_{jk} that appears in equation (1) can be derived from the general formula if

P(x^i|\theta^i) = P(x^i + \theta^i).
In that case, the Fisher information matrix reduces to

I_{jk}(\theta^i) \rightarrow \frac{1}{2} \int \frac{1}{P(y^i)} \frac{\partial P(y^i)}{\partial y^j} \frac{\partial P(y^i)}{\partial y^k} \, d^n y = I_{jk}.

A scalar measure of information is obtained by contracting the Fisher information matrix with the inverse metric g^{ik} of the space M over which the probability densities are defined,

I = g^{ik} I_{ik} = \frac{1}{2} \, g^{ik} \int \frac{1}{P} \frac{\partial P}{\partial y^i} \frac{\partial P}{\partial y^k} \, d^n y.    (3)
The case of interest here is the one where M is the (n + 1)-dimensional extended configuration space QT (with coordinates {t, x^1, ..., x^n}) of a non-relativistic particle of mass m. Then, the inverse metric is the one used to define the kinematical line element in configuration space, which is of the form g^{ik} = diag(0, 1/m, ..., 1/m). Sometimes it will be convenient to use quantities defined over the configuration space Q (with coordinates {x^1, ..., x^n}) rather than QT, and I will do so if it simplifies the notation.
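As an illustrative aside (this numerical sketch is not part of the original derivation), equations (1) and (3) can be checked for a one-dimensional Gaussian: the cross-entropy between a slightly shifted density and the original density is reproduced by the Fisher-information quadratic form. The script below uses numpy; the grid, the width sigma and the shift are arbitrary choices.

import numpy as np

# Check of Eq. (1): J(P(y + dy) : P(y)) ~ I dy^2, with I = (1/2) int (dP/dy)^2 / P dy.
# Example density: a unit-variance Gaussian, for which I = 1/2 in this convention.
sigma, shift = 1.0, 1.0e-2
y = np.linspace(-10.0, 10.0, 200001)

def gaussian(y, mu=0.0):
    return np.exp(-(y - mu) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

P = gaussian(y)
P_shifted = gaussian(y, mu=-shift)                        # P(y + shift) as a function of y
J = np.trapz(P_shifted * np.log(P_shifted / P), y)        # cross-entropy of Eq. (1)
I = 0.5 * np.trapz(np.gradient(P, y) ** 2 / P, y)         # Fisher information, Eq. (3) with g = 1
print(J, I * shift ** 2)                                  # both approximately shift^2/(2 sigma^2) = 5e-5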
III. DERIVATION OF THE SCHRÖDINGER EQUATION

Consider a classical particle moving in a potential V. Its motion is described by the Hamilton-Jacobi equation

\frac{\partial S}{\partial t} + \frac{1}{2} \, g^{\mu\nu} \frac{\partial S}{\partial x^\mu} \frac{\partial S}{\partial x^\nu} + V = 0,    (4)
where g^{µν} = diag(1/m, ..., 1/m) [8] is the inverse metric used to define the kinematical line element in the configuration space Q parametrized by coordinates {x^µ}. The velocity field u^µ is derived from S according to
u^\mu = g^{\mu\nu} \frac{\partial S}{\partial x^\nu}.    (5)
When the exact coordinates that describe the state of the classical system are unknown, one
usually describes the system by means of a probability density P (t, xµ ). The probability
density must satisfy the following two conditions: it must be normalized,
\int P \, d^n x = 1,

and it must satisfy a continuity equation,

\frac{\partial P}{\partial t} + \frac{\partial}{\partial x^\mu} \left( P \, g^{\mu\nu} \frac{\partial S}{\partial x^\nu} \right) = 0.    (6)
Equations (4) and (6), together with (5), completely determine the motion of the classical
ensemble. Equations (4) and (6) can be derived from the Lagrangian
L_{CL} = \int P \left( \frac{\partial S}{\partial t} + \frac{1}{2} \, g^{\mu\nu} \frac{\partial S}{\partial x^\mu} \frac{\partial S}{\partial x^\nu} + V \right) dt \, d^n x.    (7)

To derive the quantum equations, add to the classical Lagrangian a term proportional to the Fisher information I of equation (3),

L_{QM} = L_{CL} + \lambda I,    (8)

where λ is a constant. Fixed end-point variation with respect to S leads again to (6), while fixed end-point variation with respect to P leads to

\frac{\partial S}{\partial t} + \frac{1}{2} \, g^{\mu\nu} \frac{\partial S}{\partial x^\mu} \frac{\partial S}{\partial x^\nu} + \lambda \, g^{\mu\nu} \left( \frac{1}{P^2} \frac{\partial P}{\partial x^\mu} \frac{\partial P}{\partial x^\nu} - \frac{2}{P} \frac{\partial^2 P}{\partial x^\mu \partial x^\nu} \right) + V = 0.    (9)
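The variational statements above can be checked with a short symbolic computation. The sketch below (not taken from the original text) works in one spatial dimension with g^{xx} = 1/m and writes the Fisher term of the Lagrangian density as (λ/m)(∂P/∂x)²/P, the normalization for which equation (9) comes out exactly as displayed; a different normalization of the Fisher term simply rescales λ.

import sympy as sp
from sympy.calculus.euler import euler_equations

x, t, m, lam = sp.symbols('x t m lambda', positive=True)
P = sp.Function('P', positive=True)(x, t)
S = sp.Function('S', real=True)(x, t)
V = sp.Function('V', real=True)(x)

# Lagrangian density: classical part plus a Fisher-information term
density = P * (sp.diff(S, t) + sp.diff(S, x)**2 / (2*m) + V) + (lam/m) * sp.diff(P, x)**2 / P
eq_P, eq_S = [eq.lhs for eq in euler_equations(density, [P, S], [x, t])]

# variation with respect to S gives the continuity equation (6)
print(sp.simplify(eq_S + sp.diff(P, t) + sp.diff(P * sp.diff(S, x) / m, x)))   # -> 0
# variation with respect to P gives the modified Hamilton-Jacobi equation (9)
eq9 = (sp.diff(S, t) + sp.diff(S, x)**2 / (2*m) + V
       + (lam/m) * (sp.diff(P, x)**2 / P**2 - 2 * sp.diff(P, x, 2) / P))
print(sp.simplify(eq_P - eq9))                                                 # -> 0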
Equations (6) and (9) are identical to the Schrödinger equation provided the wave function
ψ(t, xµ ) is written in terms of S and P by
\psi = \sqrt{P} \, \exp(iS/\hbar).
Note that the classical limit of the Schrödinger theory is not the Hamilton-Jacobi equation
for a classical particle, but the equations (4) and (6) which describe a classical ensemble.
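The identification rests on the λ-term of equation (9) being the familiar quantum potential in disguise. This can be verified with a short symbolic computation (a sketch, not part of the original text, written for one spatial dimension with g^{xx} = 1/m; the value λ = ℏ²/8 is the one for which the two expressions coincide).

import sympy as sp

x, hbar, m = sp.symbols('x hbar m', positive=True)
P = sp.Function('P', positive=True)(x)

lam = hbar**2 / 8
# lambda-term of Eq. (9) in one dimension, with g^{xx} = 1/m
fisher_term = lam * (1/m) * (sp.diff(P, x)**2 / P**2 - 2 * sp.diff(P, x, 2) / P)
# quantum potential of the Madelung / de Broglie-Bohm formulation
Q = -(hbar**2 / (2*m)) * sp.diff(sp.sqrt(P), x, 2) / sp.sqrt(P)

print(sp.simplify(fisher_term - Q))   # -> 0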
It can be shown (see Appendix) that the Fisher information I increases when P is varied
while S is kept fixed. Therefore, the solution derived here is the one that minimizes the
Fisher information for a given S.
The information-theoretical content of the approach lies in the prescription to minimize the Fisher information associated with the probability distribution that
describes the position of particles, while the physical content of the theory is contained in
the assumption that one can describe the motion of particles in terms of a hydrodynamical
model.
IV. THE MINIMUM FISHER INFORMATION PRINCIPLE AND QUANTUM MECHANICS
The cross-entropy J,
J(Q : P) = \int Q(y^i) \, \ln \frac{Q(y^i)}{P(y^i)} \, d^n y,
where P , Q are two probability densities, plays a central role in information theory and in
the theory of inference. It has properties that are desirable for an information measure
[7], and it can be argued that it measures the amount of information needed to change a
prior probability density P into the posterior Q [10]. Maximization of the relative entropy (which is defined as the negative of the cross-entropy; because of this connection the maximum entropy principle is also known as the minimum cross-entropy principle, which can lead to some confusion, and the cross-entropy or its negative may be found in the literature under various names) is the basis of the maximum entropy
principle, a method for inductive inference that leads to a posterior distribution given a
prior distribution and new information in the form of expected values. The maximum
entropy principle asserts that of all the probability densities that are consistent with the
new information, the one which has the maximum relative entropy is the one that provides
the most unbiased representation of our knowledge of the state of the system. There are
several approaches that lead to the maximum entropy principle. In the original derivation by
Jaynes [11], the use of the maximum entropy principle was justified on the basis of the properties of the relative entropy as a measure of information. A different approach was taken by Shore and Johnson [12], who required that methods of inference be self-consistent. They introduced a set of axioms that were all
based on one fundamental principle: if a problem can be solved in more than one way,
the results should be consistent. They showed that given information in the form of a
set of constraints on expected values, there is only one distribution satisfying the set of
constraints which can be chosen using a procedure that satisfies their axioms, and this
unique distribution can be obtained by maximizing the relative entropy. Therefore, they
concluded that if a method of inference is based on a variational principle, maximizing any
function but the relative entropy will lead to inconsistencies unless that function and the
relative entropy have identical maxima (any monotonic function of the relative entropy will lead to the same result). One might be tempted to formulate an analogous principle of inference based on the minimization of the Fisher information (some similarities and differences of the two approaches were discussed briefly in [1]). But if we take into consideration the unique properties that make cross-entropy
the fundamental measure of information together with the result of Shore and Johnson, it
becomes difficult to justify a principle of inference based on information theory that would
operate along the same lines as maximum entropy but using the principle of minimum
Fisher information instead. To understand the use of the minimum Fisher information
principle in the context of quantum mechanics, it is crucial to take into consideration that
here one is selecting those probability distributions P(y^i) for which a perturbation that leads to P(y^i + ∆y^i) will result in the smallest increase of the cross-entropy for a given S(y^i). In other words, the method of choosing P(y^i) is based on the idea that a solution
should be stable under perturbations in the very precise sense that the amount of additional
information needed to describe the change in the solution should be as small as possible.
We have then a new principle: choose the probability densities that describe the quantum
system on the basis of the stability of those solutions, where the measure of the stability is
given by the amount of information needed to change P(y^i) into P(y^i + ∆y^i). Why should
restricting the choice of {P, S} to those that are stable in this sense lead to the excellent
predictions of quantum mechanics? Such an approach should work for physical systems that
can be represented by models in which the probability density P describes the equilibrium
density of an underlying stochastic process (see for example the derivation of the diffusion
equation using the minimum Fisher information principle in [13]). Such models of quantum
mechanics do exist: a formulation along these lines was first proposed by Bohm and Vigier
[14], and later a different but related formulation was given by Nelson [15](for a review of the
stochastic formulation of the quantum theory that compares these two approaches, see [16]).
Whether the additional assumptions needed to build these particular models are sound, and
whether they provide a correct description of quantum mechanics will depend of course on
the experimental predictions that they make. The minimum Fisher information approach
can be of no help here, since it is only concerned with making inferences about probability
distributions and operates therefore at the epistemological level.
V. KÄHLER AND HILBERT SPACE STRUCTURES OF QUANTUM
MECHANICS
I now want to examine the assumptions that are needed to construct the Kähler and
Hilbert space structures of quantum mechanics. My aim is not to give a mathematically
rigorous derivation of these results, but to give arguments that justify introducing the Kähler
space structure on the basis of mathematical structures that arise naturally in the hydro-
dynamical model and in information theory. In particular, I want to show that the Kähler
structure of quantum mechanics results from combining the symplectic structure of the hy-
drodynamical model with the Fisher information metric of information theory. The complex
transformation of the hydrodynamical variables that puts this Kähler metric in its canonical
form is the one that leads to the usual Schrödinger representation. Good descriptions of
the geometrical formulation of quantum mechanics covering the case of infinite-dimensional
Kähler manifolds are available in the literature; see for example Cirelli et al. [17], Ashtekar
and Schilling [18] and Brody and Hughston [19]. The approach of Brody and Hughston is
of special interest in that they make explicit use of the Fisher information metric, although
without making reference to the hydrodynamical formulation.
I first look at the symplectic structure of the hydrodynamical formulation. Introduce
as basic variables the hydrodynamical fields {P, S}. The symplectic structure is given by

\Omega(\delta, \delta') = \int \left( \delta P \, \delta' S - \delta S \, \delta' P \right) d^n x,

where δ and δ′ are two generic systems of increments for the phase-space variables. The Poisson brackets for two functions F^1(P, S), F^2(P, S) take the form

\{ F^1(P, S), F^2(P, S) \} = \int \left( \frac{\delta F^1}{\delta P} \frac{\delta F^2}{\delta S} - \frac{\delta F^1}{\delta S} \frac{\delta F^2}{\delta P} \right) d^n x.
The equations of motion (6), (9) can be written as
\frac{\partial P}{\partial t} = \{P, H\} = \frac{\delta H}{\delta S},

\frac{\partial S}{\partial t} = \{S, H\} = -\frac{\delta H}{\delta P}.
To introduce the Fisher information metric, let θ^µ be a set of real continuous parameters, and consider a parametric family of positive distributions P(x^µ; θ^µ), where the probability densities P are solutions of the Schrödinger equation (at time t = 0). Then there is a natural metric over the space of parameters θ^µ given by the Fisher information matrix [20], and it leads to a concept of distance defined by

ds^2 = \frac{1}{2} \int \frac{1}{P} \frac{\partial P}{\partial \theta^\mu} \frac{\partial P}{\partial \theta^\nu} \, d^n x \; \delta\theta^\mu \delta\theta^\nu = I_{\mu\nu} \, \delta\theta^\mu \delta\theta^\nu.    (10)

With the help of

\delta P = \frac{\partial P}{\partial \theta^\mu} \, \delta\theta^\mu,    (11)

equation (10) defines a metric over the space of solutions of the Schrödinger equation,

ds^2 = \int g^{(P)}(x^\mu) \, \delta P \, \delta P \, d^n x,

where

g^{(P)} = \frac{1}{2 P(x^\mu)}.
I now want to extend the metric g^{(P)} over the probability densities to a metric g_{ab} over the
whole space {P, S} of solutions of the Schrödinger equation, in such a way that the metric
structure is compatible with the symplectic structure. To do this, introduce a complex
structure J^a_{\;b} and impose the following conditions,

\Omega_{ab} = g_{ac} \, J^c_{\;b},    (12)

g_{ab} = J^c_{\;a} \, g_{cd} \, J^d_{\;b},    (13)

J^a_{\;b} \, J^b_{\;c} = -\delta^a_{\;c}.    (14)
A set of {Ωab , gab , J ab } that satisfy equations (12), (13) and (14) defines a Kähler structure.
Equation (12) is a compatibility equation between Ωab and gab , equation (13) is the condition
that the metric should be Hermitian, and equation (14) is the condition that J ab should be
a complex structure. Let
\Omega_{ab} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.
Then the solutions gab and J ab to equations (12),(13) and (14) depend on an arbitrary real
function A and are of the form
11
(P )
ℏg A
gab (A) = ,
(P ) −1
A ℏg (1 + A2 )
(P ) −1
(1 + A2 )
A ℏg
J ab (A) = .
−ℏg (P ) −A
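These expressions can be checked directly against conditions (12), (13) and (14). The following sketch (not part of the original text) treats the combination ℏg^{(P)} at a fixed point as a single positive symbol k and uses sympy matrices.

import sympy as sp

A = sp.symbols('A', real=True)
k = sp.symbols('k', positive=True)          # k stands for hbar * g^(P) at a fixed point

Omega = sp.Matrix([[0, 1], [-1, 0]])
g = sp.Matrix([[k, A], [A, (1 + A**2) / k]])
J = sp.Matrix([[A, (1 + A**2) / k], [-k, -A]])

print((g * J - Omega).applyfunc(sp.simplify))     # compatibility, Eq. (12): zero matrix
print((J.T * g * J - g).applyfunc(sp.simplify))   # hermiticity, Eq. (13): zero matrix
print((J * J + sp.eye(2)).applyfunc(sp.simplify)) # complex structure, Eq. (14): zero matrix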
The choice of A that leads to the simplest Kähler structure is A = 0, which is a unique
choice in that it leads to the flat Kähler metric. I will show this by carrying out the
complex transformation that leads to the canonical form for the flat Kähler metric. I set
A = 0, and work with the Kähler structure given by
\Omega_{ab} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},    (15)

g_{ab} = \begin{pmatrix} \hbar g^{(P)} & 0 \\ 0 & (\hbar g^{(P)})^{-1} \end{pmatrix},    (16)

J^a_{\;b} = \begin{pmatrix} 0 & (\hbar g^{(P)})^{-1} \\ -\hbar g^{(P)} & 0 \end{pmatrix}.    (17)
Now introduce complex coordinates by means of the Madelung transformation,

\psi = \sqrt{P} \, \exp(iS/\hbar),

\psi^* = \sqrt{P} \, \exp(-iS/\hbar).
In terms of the new variables, (15), (16) and (17) take the canonical form
\Omega_{ab} = \begin{pmatrix} 0 & i\hbar \\ -i\hbar & 0 \end{pmatrix},

g_{ab} = \begin{pmatrix} 0 & \hbar \\ \hbar & 0 \end{pmatrix},

J^a_{\;b} = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}.
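That the transformation to ψ, ψ* brings Ω, g and J to these canonical forms can be checked pointwise with a short symbolic computation (a sketch, not part of the original text; P and S are treated as real values at a single point of configuration space, and g^{(P)} = 1/(2P)).

import sympy as sp

P, hbar = sp.symbols('P hbar', positive=True)
S = sp.symbols('S', real=True)
gP = 1 / (2 * P)                                       # g^(P) at the chosen point
psi = sp.sqrt(P) * sp.exp(sp.I * S / hbar)
psistar = sp.sqrt(P) * sp.exp(-sp.I * S / hbar)

dv_du = sp.Matrix([psi, psistar]).jacobian([P, S])     # d(psi, psi*)/d(P, S)
du_dv = dv_du.inv()                                    # d(P, S)/d(psi, psi*)

Omega = sp.Matrix([[0, 1], [-1, 0]])
g = sp.Matrix([[hbar * gP, 0], [0, 1 / (hbar * gP)]])
J = sp.Matrix([[0, 1 / (hbar * gP)], [-hbar * gP, 0]])

# covariant tensors pull back with du_dv; the (1,1) tensor J transforms with both Jacobians
print((du_dv.T * Omega * du_dv).applyfunc(sp.simplify))   # -> [[0, I*hbar], [-I*hbar, 0]]
print((du_dv.T * g * du_dv).applyfunc(sp.simplify))       # -> [[0, hbar], [hbar, 0]]
print((dv_du * J * du_dv).applyfunc(sp.simplify))         # -> [[-I, 0], [0, I]]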
The Madelung transformation is remarkable in that the Hamiltonian takes the very simple
form
H = \int \left( \frac{\hbar^2}{2} \, g^{\mu\nu} \frac{\partial \psi^*}{\partial x^\mu} \frac{\partial \psi}{\partial x^\nu} + V \psi^* \psi \right) d^n x.
In this way the Hilbert space structure of quantum mechanics results from combining the
symplectic structure of the hydrodynamical model with the Fisher information metric of
information theory.
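The simplicity of the Hamiltonian in the new variables can be traced to an identity for the kinetic density: written in the hydrodynamical variables, it splits into the classical kinetic term and the Fisher-information term. A symbolic sketch (not part of the original text, one spatial dimension, g^{xx} = 1/m):

import sympy as sp

x, hbar, m = sp.symbols('x hbar m', positive=True)
P = sp.Function('P', positive=True)(x)
S = sp.Function('S', real=True)(x)

psi = sp.sqrt(P) * sp.exp(sp.I * S / hbar)
psistar = sp.sqrt(P) * sp.exp(-sp.I * S / hbar)

# kinetic density of H in the psi variables
kin_psi = hbar**2 / (2*m) * sp.diff(psistar, x) * sp.diff(psi, x)
# the same density in the hydrodynamical variables: Fisher piece plus classical piece
kin_PS = hbar**2 / (8*m) * sp.diff(P, x)**2 / P + P * sp.diff(S, x)**2 / (2*m)

print(sp.simplify(kin_psi - kin_PS))   # -> 0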
An important result that comes out of this analysis concerns the issue of suitable boundary
conditions for the fields P and S. It has been pointed out [21] that the Schrödinger
theory is not strictly equivalent to some of the other formulations (i.e., the hydrodynamical
formulation and stochastic mechanics) because features such as the quantization of angular
momentum, which are natural when the theory is formulated in terms of wave functions,
require an additional constraint in a theory formulated in terms of hydrodynamical variables.
For example, in the case of the hydrogen atom, the quantization of angular momentum
results from requiring that the wave function be single-valued in configuration space. But
the derivation of the Kähler structure and Hilbert space structure presented here shows
that the Schrödinger representation follows naturally from the hydrodynamical formulation
provided we take into account the role of the Fisher information metric, and furthermore
that this representation is unique in that it is the coordinate system in which the Kähler
structure takes the simplest form. From a purely mathematical point of view, it is not
surprising that the correct boundary conditions are those that are simplest when formulated
in the simplest coordinate system, i.e. single-valuedness of the canonically conjugate fields
ψ, ψ ∗ .
VI. APPENDIX
I want to examine the extremum obtained from the fixed end-point variation of the
Lagrangian LQM , equation (8). In particular, I wish to show the following: given P and S
that satisfy equations (6) and (9), a small variation of the probability density P(x^µ, t) → P′(x^µ, t) = P(x^µ, t) + εδP(x^µ, t) for fixed S will lead to an increase in L_QM, as well as an
increase in the Fisher information I.
I assume fixed end-point variations, and variations εδP that are well defined in the sense that P′ will have the usual properties required of a probability density (such as P′ > 0 and
normalization).
Let P → P′ = P + εδP. Since P and S are solutions of the variational problem, the terms linear in ε vanish. Keeping terms up to order ε², one can write ∆L_QM as
\Delta L_{QM} = \frac{\epsilon^2 \lambda}{2} \int P \, g^{\mu\nu} \, \frac{\partial}{\partial x^\mu}\!\left(\frac{\delta P}{P}\right) \frac{\partial}{\partial x^\nu}\!\left(\frac{\delta P}{P}\right) dt \, d^n x + O(\epsilon^3),
which shows that ∆L_QM > 0 for small variations, and therefore that the extremum of L_QM is a minimum. Furthermore, since ∆L_QM ∼ λ, it is the Fisher information term I in the Lagrangian L_QM that increases, and the extremum is also a minimum of the Fisher information.
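The quadratic form given above can also be verified symbolically: the ε² part of the Fisher-information density, expanded about a solution, is exactly (P/2)[∂(δP/P)/∂x]² in each coordinate direction. A sketch (not part of the original text, one dimension, metric factor omitted):

import sympy as sp

x, eps = sp.symbols('x epsilon')
P = sp.Function('P', positive=True)(x)
dP = sp.Function('deltaP', real=True)(x)

# Fisher-information density of Eq. (3) in one dimension (metric factor omitted)
def fisher_density(f):
    return sp.diff(f, x)**2 / (2 * f)

# coefficient of eps^2 in the expansion of the density about P
second_order = sp.diff(fisher_density(P + eps * dP), eps, 2).subs(eps, 0) / 2
claimed = P * sp.diff(dP / P, x)**2 / 2

print(sp.simplify(second_order - claimed))   # -> 0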
REFERENCES
[4] B. Roy Frieden, J. Mod. Opt. 35 (1988) 1297; Am. J. Phys. 57 (1989) 1004.
[5] B. Roy Frieden, Physics from Fisher information (Cambridge Univ. Press, Cambridge,
1999).
[7] S. Kullback, Information Theory and Statistics (Wiley, New York, 1959); corrected and revised edition (Dover, New York, 1968).
[8] J. L. Synge, Classical Dynamics, in Encyclopedia of Physics, vol. III/1, ed. S. Flügge
(Springer, Berlin, 1960).
[9] E. T. Jaynes, Clearing up Mysteries - The Original Goal, in: Maximum Entropy and
Bayesian Methods, ed. J. Skilling (Kluwer, Dordrecht, 1989).
[11] E. T. Jaynes, Phys. Rev. 106 (1957) 620; IEEE Trans. Syst. Cybern., SSC-4 (1968) 227.
[12] J. E. Shore and R. Johnson, IEEE Trans. Inform. Theory, IT-26 (1980) 26.
[15] E. Nelson, Phys. Rev. 150B (1966) 1079; Quantum Fluctuations (Princeton Univ. Press,
Princeton, 1985).
[17] R. Cirelli, A. Manià and L. Pizzocchero, J. Math. Phys. 31 (1990) 2891; 31 (1990) 2898.
[19] D. C. Brody and L. Hughston, Statistical Geometry, submitted to Proc. Roy. Soc. Lond.;
e-Print Archive: gr-qc/9701051.