
Hydrodynamical formulation of quantum mechanics, Kähler structure, and Fisher information

Marcel Reginatto
Environmental Measurements Laboratory, U. S. Department of Energy,
201 Varick St., 5th floor, New York, New York 10014-4811, USA
(February 1, 2008)
arXiv:quant-ph/9909065v1, 21 Sep 1999

Abstract

The Schrödinger equation can be derived using the minimum Fisher information principle. I discuss why such an approach should work, and also show that the Kähler and Hilbert space structures of quantum mechanics result from combining the symplectic structure of the hydrodynamical model with the Fisher information metric.

PACS: 03.65.Bz; 89.70.+c

Keywords: Schrödinger; hydrodynamical formulation; Kähler; Fisher information

I. INTRODUCTION

In a previous paper [1], it was shown that the hydrodynamical formulation of the Schrödinger equation can be derived using an information-theoretical approach that is based on the principle of minimum Fisher information. A derivation along similar lines is also possible for other non-relativistic quantum mechanical equations, such as the Pauli equation [2] and the equation for the quantum rotator [3]. The purpose of this paper is two-fold: to examine why such an information-theoretical approach should work, and to show that the Kähler and Hilbert space structures of quantum mechanics result from combining the symplectic structure of the hydrodynamical model with the Fisher information metric of information theory. The complex transformation of the hydrodynamical variables that puts this Kähler metric in its canonical form is the one that leads to the usual Schrödinger representation.

Frieden [4] was the first one to point out a connection between the principle of minimum Fisher information and the Schrödinger equation. Frieden and coworkers later developed and extended this work in a series of papers which made use of a new principle called the extreme physical information (EPI) principle. In this paper I will not discuss the EPI principle, which differs from the principle of minimum Fisher information in many ways (for a review of the EPI approach, see the book by Frieden [5]), but will concentrate instead on the information-theoretical approach used in [1]. In this approach, the emphasis is on using the principle of minimum Fisher information to complement a physical picture derived from a hydrodynamical model. Applying the principle under the assumption that one can describe the motion of particles in terms of a hydrodynamical model leads directly to Madelung's hydrodynamical formulation of quantum mechanics [6].

II. CROSS-ENTROPY AND FISHER INFORMATION

Let P(y^i) be a probability density which is a function of n continuous coordinates y^i, and let P(y^i + ∆y^i) be the density that results from a small change in the y^i. Expand P(y^i + ∆y^i) in a Taylor series, and calculate the cross-entropy J up to the first non-vanishing term,

$$
\begin{aligned}
J\left(P(y^i + \Delta y^i) : P(y^i)\right) &= \int P(y^i + \Delta y^i)\,\ln\frac{P(y^i + \Delta y^i)}{P(y^i)}\,d^n y \\
&\simeq \frac{1}{2}\left[\int \frac{1}{P(y^i)}\,\frac{\partial P(y^i)}{\partial y^j}\,\frac{\partial P(y^i)}{\partial y^k}\,d^n y\right]\Delta y^j\,\Delta y^k
= I_{jk}\,\Delta y^j\,\Delta y^k.
\end{aligned} \tag{1}
$$

The I_{jk} are the elements of the Fisher information matrix. This is not the most general expression for the Fisher information matrix, but the particular case that is of interest here. The general expression is of the form [7]

$$
I_{jk}(\theta^i) = \frac{1}{2}\int \frac{1}{P(x^i|\theta^i)}\,\frac{\partial P(x^i|\theta^i)}{\partial \theta^j}\,\frac{\partial P(x^i|\theta^i)}{\partial \theta^k}\,d^n x \tag{2}
$$

where P(x^i|θ^i) is a probability density that depends on a set of n parameters θ^i in addition to the n coordinates x^i. The expression for the I_{jk} that appears in equation (1) can be derived from the general formula if

$$
P(x^i|\theta^i) = P(x^i + \theta^i).
$$

To see this, introduce a new set of parameters y^i = x^i + θ^i. Then

$$
I_{jk}(\theta^i) \rightarrow \frac{1}{2}\int \frac{1}{P(y^i)}\,\frac{\partial P(y^i)}{\partial y^j}\,\frac{\partial P(y^i)}{\partial y^k}\,d^n y = I_{jk}
$$

since d^n x → d^n y, as the integration over the x^i coordinates is for fixed values of θ^i.

If P is defined over an n-dimensional manifold M with (positive) inverse metric g^{ik}, there is a natural definition of the amount of information I associated with P, which is obtained by contracting g^{ik} with the elements of the Fisher information matrix,

$$
I = g^{ik} I_{ik} = \frac{1}{2}\int g^{ik}\,\frac{1}{P}\,\frac{\partial P}{\partial y^i}\,\frac{\partial P}{\partial y^k}\,d^n y. \tag{3}
$$

The case of interest here is the one where M is the (n+1)-dimensional extended configuration space QT (with coordinates {t, x^1, ..., x^n}) of a non-relativistic particle of mass m. Then the inverse metric is the one used to define the kinematical line element in configuration space, which is of the form g^{ik} = diag(0, 1/m, ..., 1/m). Sometimes it will be convenient to use quantities defined over the configuration space Q (with coordinates {x^1, ..., x^n}) rather than QT, and I will do so if it simplifies the notation.
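As a concrete illustration of equation (1), consider a minimal numerical sketch in Python, assuming a one-dimensional Gaussian density with an arbitrarily chosen width σ: the cross-entropy between the shifted density P(y + ∆y) and P(y) grows quadratically in the shift, with the coefficient given by the Fisher information element I_{11} defined with the 1/2 convention used above.

```python
import numpy as np

# Check of Eq. (1) for a 1D Gaussian (sigma and the shift Delta are arbitrary
# choices for this sketch). With the 1/2 convention of the text,
# J(P(y+Delta) : P(y)) ~= I_11 * Delta^2 for small Delta.
y = np.linspace(-10.0, 10.0, 20001)
dy = y[1] - y[0]
sigma = 1.3
P = np.exp(-y**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

# Fisher information element I_11 = (1/2) * integral (1/P) (dP/dy)^2 dy
I_11 = 0.5 * np.sum(np.gradient(P, dy)**2 / P) * dy   # analytic value: 1/(2 sigma^2)

# Cross-entropy between the shifted and unshifted densities
Delta = 0.01
P_shift = np.exp(-(y + Delta)**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
J = np.sum(P_shift * np.log(P_shift / P)) * dy

print(I_11, 1.0 / (2 * sigma**2))      # both ~0.296
print(J, I_11 * Delta**2)              # both ~3.0e-5
```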

III. DERIVATION OF THE SCHRÖDINGER EQUATION

In the Hamilton-Jacobi formulation of classical mechanics, the equation of motion takes the form

$$
\frac{\partial S}{\partial t} + \frac{1}{2}\,g^{\mu\nu}\,\frac{\partial S}{\partial x^\mu}\,\frac{\partial S}{\partial x^\nu} + V = 0 \tag{4}
$$

where g^{μν} = diag(1/m, ..., 1/m) [8] is the inverse metric used to define the kinematical line element in the configuration space Q parametrized by coordinates {x^μ}. The velocity field u^μ is derived from S according to

$$
u^\mu = g^{\mu\nu}\,\frac{\partial S}{\partial x^\nu}. \tag{5}
$$

When the exact coordinates that describe the state of the classical system are unknown, one usually describes the system by means of a probability density P(t, x^μ). The probability density must satisfy the following two conditions: it must be normalized,

$$
\int P\,d^n x = 1,
$$

and it must satisfy a continuity equation,

$$
\frac{\partial P}{\partial t} + \frac{\partial}{\partial x^\mu}\left(P\,g^{\mu\nu}\,\frac{\partial S}{\partial x^\nu}\right) = 0. \tag{6}
$$

Equations (4) and (6), together with (5), completely determine the motion of the classical ensemble. Equations (4) and (6) can be derived from the Lagrangian

$$
L_{CL} = \int P\left\{\frac{\partial S}{\partial t} + \frac{1}{2}\,g^{\mu\nu}\,\frac{\partial S}{\partial x^\mu}\,\frac{\partial S}{\partial x^\nu} + V\right\}dt\,d^n x \tag{7}
$$

by fixed end-point variation (δP = δS = 0 at the boundaries) with respect to S and P.


Quantization of the classical ensemble is achieved by adding to the classical Lagrangian (7) a term proportional to the information I defined by equation (3) [1]. This leads to the Lagrangian for the Schrödinger equation,

$$
L_{QM} = L_{CL} + \lambda I = \int P\left\{\frac{\partial S}{\partial t} + \frac{1}{2}\,g^{\mu\nu}\left[\frac{\partial S}{\partial x^\mu}\,\frac{\partial S}{\partial x^\nu} + \frac{\lambda}{P^2}\,\frac{\partial P}{\partial x^\mu}\,\frac{\partial P}{\partial x^\nu}\right] + V\right\}dt\,d^n x. \tag{8}
$$

Fixed end-point variation with respect to S leads again to (6), while fixed end-point variation with respect to P leads to

$$
\frac{\partial S}{\partial t} + \frac{1}{2}\,g^{\mu\nu}\left\{\frac{\partial S}{\partial x^\mu}\,\frac{\partial S}{\partial x^\nu} + \lambda\left[\frac{1}{P^2}\,\frac{\partial P}{\partial x^\mu}\,\frac{\partial P}{\partial x^\nu} - \frac{2}{P}\,\frac{\partial^2 P}{\partial x^\mu\,\partial x^\nu}\right]\right\} + V = 0. \tag{9}
$$

Equations (6) and (9) are identical to the Schrödinger equation provided the wave function ψ(t, x^μ) is written in terms of S and P by

$$
\psi = \sqrt{P}\,\exp(iS/\hbar)
$$

and the parameter λ is set equal to

$$
\lambda = \left(\frac{\hbar}{2}\right)^2.
$$

Note that the classical limit of the Schrödinger theory is not the Hamilton-Jacobi equation
for a classical particle, but the equations (4) and (6) which describe a classical ensemble.
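The role of λ can also be seen directly: with λ = (ħ/2)², the λ-term in equation (9) is precisely the quantum potential of the de Broglie-Bohm theory, so that (9) is the quantum Hamilton-Jacobi equation. The following symbolic sketch verifies this identity in one spatial dimension (the restriction to one dimension and the choice g^{xx} = 1/m are simplifying assumptions made only for this check).

```python
import sympy as sp

# Check (1D sketch): with lambda = (hbar/2)^2 the lambda-term of Eq. (9)
# equals the Bohm quantum potential Q = -(hbar^2/2m) (sqrt(P))'' / sqrt(P).
x, m, hbar = sp.symbols('x m hbar', positive=True)
P = sp.Function('P', positive=True)(x)
lam = (hbar / 2)**2

# lambda-term of Eq. (9), with g^{xx} = 1/m
term = sp.Rational(1, 2) / m * lam * (P.diff(x)**2 / P**2 - 2 * P.diff(x, 2) / P)

# Bohm quantum potential
Q = -(hbar**2 / (2 * m)) * sp.sqrt(P).diff(x, 2) / sp.sqrt(P)

print(sp.simplify(term - Q))   # -> 0
```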

It can be shown (see Appendix) that the Fisher information I increases when P is varied
while S is kept fixed. Therefore, the solution derived here is the one that minimizes the
Fisher information for a given S.

The approach followed here is of interest in that it provides a way of distinguishing between physical and information-theoretical assumptions (for a very clear account of the importance of making this type of distinction in quantum mechanics see the paper by Jaynes [9]). In general terms, the information-theoretical content of the theory lies in the prescription to minimize the Fisher information associated with the probability distribution that describes the position of particles, while the physical content of the theory is contained in the assumption that one can describe the motion of particles in terms of a hydrodynamical model.

IV. ON THE USE OF THE MINIMUM FISHER INFORMATION PRINCIPLE IN QUANTUM MECHANICS

The cross-entropy J,

$$
J(Q : P) = \int Q(y^i)\,\ln\!\left(\frac{Q(y^i)}{P(y^i)}\right)d^n y,
$$

where P, Q are two probability densities, plays a central role in information theory and in the theory of inference. It has properties that are desirable for an information measure [7], and it can be argued that it measures the amount of information needed to change a prior probability density P into the posterior Q [10]. Maximization of the relative entropy (which is defined as the negative of the cross-entropy¹) is the basis of the maximum entropy principle, a method for inductive inference that leads to a posterior distribution given a prior distribution and new information in the form of expected values.

¹ A note on terminology: due to the connection between relative entropy and cross-entropy, the maximum entropy principle is also known as the minimum cross-entropy principle, which can lead to some confusion. The cross-entropy (or its negative) may be found in the literature under various names: Kullback-Leibler information, directed divergence, discrimination information, Rényi's information gain, expected weight of evidence, entropy, entropy distance.

The maximum entropy principle asserts that of all the probability densities that are consistent with the new information, the one which has the maximum relative entropy is the one that provides the most unbiased representation of our knowledge of the state of the system. There are several approaches that lead to the maximum entropy principle. In the original derivation by Jaynes [11], the use of the maximum entropy principle was justified on the basis of the relative entropy's unique properties as an uncertainty measure. An independent justification based on consistency arguments was later given by Shore and Johnson [12]. Jaynes had already remarked that inferences made using any information measure other than the entropy may lead to contradictions. Shore and Johnson considered the consequences of requiring that methods of inference be self-consistent. They introduced a set of axioms that were all based on one fundamental principle: if a problem can be solved in more than one way, the results should be consistent. They showed that given information in the form of a set of constraints on expected values, there is only one distribution satisfying the set of constraints which can be chosen using a procedure that satisfies their axioms, and this unique distribution can be obtained by maximizing the relative entropy. Therefore, they concluded that if a method of inference is based on a variational principle, maximizing any function but the relative entropy will lead to inconsistencies unless that function and the relative entropy have identical maxima (any monotonic function of the relative entropy will work, for example).


It is tempting to argue by analogy that the minimum Fisher information derivation of the Schrödinger equation is in essence nothing but a variation on maximum entropy, one in which maximization of relative entropy is simply replaced by minimization of the Fisher information (some similarities and differences of the two approaches were discussed briefly in [1]). But if we take into consideration the unique properties that make cross-entropy the fundamental measure of information, together with the result of Shore and Johnson, it becomes difficult to justify a principle of inference based on information theory that would operate along the same lines as maximum entropy but using the principle of minimum Fisher information instead. To understand the use of the minimum Fisher information principle in the context of quantum mechanics, it is crucial to take into consideration that here one is selecting those probability distributions P(y^i) for which a perturbation that leads to P(y^i + ∆y^i) will result in the smallest increase of the cross-entropy for a given S(y^i). In other words, the method of choosing P(y^i) is based on the idea that a solution should be stable under perturbations in the very precise sense that the amount of additional information needed to describe the change in the solution should be as small as possible.

We have then a new principle: choose the probability densities that describe the quantum system on the basis of the stability of those solutions, where the measure of the stability is given by the amount of information needed to change P(y^i) into P(y^i + ∆y^i). Why should
restricting the choice of {P, S} to those that are stable in this sense lead to the excellent predictions of quantum mechanics? Such an approach should work for physical systems that can be represented by models in which the probability density P describes the equilibrium density of an underlying stochastic process (see for example the derivation of the diffusion equation using the minimum Fisher information principle in [13]). Such models of quantum mechanics do exist: a formulation along these lines was first proposed by Bohm and Vigier [14], and later a different but related formulation was given by Nelson [15] (for a review of the stochastic formulation of the quantum theory that compares these two approaches, see [16]). Whether the additional assumptions needed to build these particular models are sound, and whether they provide a correct description of quantum mechanics will depend of course on the experimental predictions that they make. The minimum Fisher information approach can be of no help here, since it is only concerned with making inferences about probability distributions and operates therefore at the epistemological level.

V. KÄHLER AND HILBERT SPACE STRUCTURES OF QUANTUM MECHANICS

I now want to examine the assumptions that are needed to construct the Kähler and Hilbert space structures of quantum mechanics. My aim is not to give a mathematically rigorous derivation of these results, but to give arguments that justify introducing the Kähler space structure on the basis of mathematical structures that arise naturally in the hydrodynamical model and in information theory. In particular, I want to show that the Kähler structure of quantum mechanics results from combining the symplectic structure of the hydrodynamical model with the Fisher information metric of information theory. The complex transformation of the hydrodynamical variables that puts this Kähler metric in its canonical form is the one that leads to the usual Schrödinger representation. Good descriptions of the geometrical formulation of quantum mechanics covering the case of infinite-dimensional Kähler manifolds are available in the literature; see for example Cirelli et al. [17], Ashtekar and Schilling [18] and Brody and Hughston [19]. The approach of Brody and Hughston is of special interest in that they make explicit use of the Fisher information metric, although without making reference to the hydrodynamical formulation.
I first look at the symplectic structure of the hydrodynamical formulation. Introduce as basic variables the hydrodynamical fields {P, S}. The symplectic structure is given by the two-form

$$
\omega(\delta P(x^\mu), \delta S(x^\mu);\,\delta' P(x^\mu), \delta' S(x^\mu))
= \int \left(\delta P(x^\mu),\,\delta S(x^\mu)\right)
\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}
\begin{pmatrix} \delta' P(x^\mu) \\ \delta' S(x^\mu) \end{pmatrix} d^n x
= \int \left(\delta P(x^\mu),\,\delta S(x^\mu)\right)\cdot\Omega\cdot
\begin{pmatrix} \delta' P(x^\mu) \\ \delta' S(x^\mu) \end{pmatrix} d^n x
$$

where δ and δ′ are two generic systems of increments for the phase-space variables. The Poisson brackets for two functions F^1(P, S), F^2(P, S) take the form

$$
\left\{F^1(P,S),\,F^2(P,S)\right\} = \int\left(\frac{\delta F^1}{\delta P}\,\frac{\delta F^2}{\delta S} - \frac{\delta F^1}{\delta S}\,\frac{\delta F^2}{\delta P}\right)d^n x.
$$

The equations of motion (6), (9) can be written as

$$
\frac{\partial P}{\partial t} = \{P, H\} = \frac{\delta H}{\delta S}, \qquad
\frac{\partial S}{\partial t} = \{S, H\} = -\frac{\delta H}{\delta P}
$$

with the Hamiltonian H given by

$$
H = \int P\left\{\frac{1}{2}\,g^{\mu\nu}\left[\frac{\partial S}{\partial x^\mu}\,\frac{\partial S}{\partial x^\nu} + \left(\frac{\hbar}{2}\right)^2\frac{1}{P^2}\,\frac{\partial P}{\partial x^\mu}\,\frac{\partial P}{\partial x^\nu}\right] + V\right\}d^n x.
$$

H acts as the generator of time translations.
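As a consistency check, the functional derivatives of H can be computed explicitly and compared with the right-hand sides of equations (6) and (9). The symbolic sketch below does this in one spatial dimension, with g^{xx} = 1/m and λ = (ħ/2)² (simplifying assumptions made only for the check); the stand-in symbols p0, p1, s0, s1 denote P, ∂P/∂x, S, ∂S/∂x.

```python
import sympy as sp

# Hamilton's equations dP/dt = dH/dS, dS/dt = -dH/dP, checked symbolically in 1D.
x, m, hbar = sp.symbols('x m hbar', positive=True)
P = sp.Function('P')(x)
S = sp.Function('S')(x)
V = sp.Function('V')(x)
p0, p1, s0, s1 = sp.symbols('p0 p1 s0 s1')     # stand-ins for P, P', S, S'

# Hamiltonian density of the text, written in the stand-in variables
h = p0 * (sp.Rational(1, 2) / m * (s1**2 + (hbar / 2)**2 * p1**2 / p0**2) + V)

back = {p0: P, p1: P.diff(x), s0: S, s1: S.diff(x)}

def func_deriv(dens, f0, f1):
    """delta H / delta f = dh/df - d/dx (dh/df') for a first-order density."""
    return dens.diff(f0).subs(back) - sp.diff(dens.diff(f1).subs(back), x)

dH_dS = func_deriv(h, s0, s1)   # should reproduce -d/dx(P S'/m), cf. Eq. (6)
dH_dP = func_deriv(h, p0, p1)   # should reproduce the x-dependent part of Eq. (9)

print(sp.simplify(dH_dS + sp.diff(P * S.diff(x) / m, x)))                      # -> 0
print(sp.simplify(dH_dP - (S.diff(x)**2 / (2 * m) + V
      + hbar**2 / (8 * m) * (P.diff(x)**2 / P**2 - 2 * P.diff(x, 2) / P))))    # -> 0
```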

To introduce the Fisher information metric, let θ^μ be a set of real continuous parameters, and consider the parametric family of positive distributions defined by

$$
P(x^\mu|\theta^\mu) = P(x^\mu + \theta^\mu)
$$

where the probability densities P are solutions of the Schrödinger equation (at time t = 0). Then there is a natural metric over the space of parameters θ^μ given by the Fisher information matrix [20], and it leads to a concept of distance defined by

$$
ds^2(\theta^\mu) = \frac{1}{2}\int \frac{1}{P(x^\mu|\theta^\mu)}\,\frac{\partial P(x^\mu|\theta^\mu)}{\partial \theta^\rho}\,\frac{\partial P(x^\mu|\theta^\mu)}{\partial \theta^\sigma}\,d^n x\;\delta\theta^\rho\,\delta\theta^\sigma. \tag{10}
$$

Using

$$
\delta P = \frac{\partial P}{\partial \theta^\mu}\,\delta\theta^\mu
$$

one can write equation (10) as

$$
ds^2(\theta^\mu) = \frac{1}{2}\int \frac{1}{P(x^\mu|\theta^\mu)}\,\delta P(x^\mu|\theta^\mu)\,\delta P(x^\mu|\theta^\mu)\,d^n x. \tag{11}
$$

We use equation (11) to introduce a metric over the space of solutions of the Schrödinger equation (i.e., P(x^μ|θ^μ) with θ^μ = 0) by setting

$$
ds^2(\delta P, \delta' P) = \frac{1}{2}\int \frac{1}{P(x^\mu)}\,\delta P(x^\mu)\,\delta' P(x^\mu)\,d^3 x
= \int g^{(P)}\,\delta P(x^\mu)\,\delta' P(x^\mu)\,d^3 x
$$

where

$$
P(x^\mu) = P(x^\mu|\theta^\mu = 0), \qquad
\delta P(x^\mu) = \delta P(x^\mu|\theta^\mu)\big|_{\theta^\mu = 0}, \qquad
g^{(P)} = \frac{1}{2P(x^\mu)}.
$$

I now want to extend the metric g^{(P)} over the probability densities to a metric g_{ab} over the whole space {P, S} of solutions of the Schrödinger equation, in such a way that the metric structure is compatible with the symplectic structure. To do this, introduce a complex structure J^a{}_b and impose the following conditions,

$$
\Omega_{ab} = g_{ac}\,J^c{}_b \tag{12}
$$

$$
J^a{}_c\,g_{ab}\,J^b{}_d = g_{cd} \tag{13}
$$

$$
J^a{}_b\,J^b{}_c = -\delta^a{}_c \tag{14}
$$

A set {Ω_{ab}, g_{ab}, J^a{}_b} that satisfies equations (12), (13) and (14) defines a Kähler structure. Equation (12) is a compatibility equation between Ω_{ab} and g_{ab}, equation (13) is the condition that the metric should be Hermitian, and equation (14) is the condition that J^a{}_b should be a complex structure. Let

$$
\Omega_{ab} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}
$$

and require that g_{ab} be a real, symmetric matrix of the form

$$
g_{ab} = \begin{pmatrix} \hbar g^{(P)} & \cdot \\ \cdot & \cdot \end{pmatrix}.
$$

Then the solutions g_{ab} and J^a{}_b to equations (12), (13) and (14) depend on an arbitrary real function A and are of the form

$$
g_{ab}(A) = \begin{pmatrix} \hbar g^{(P)} & A \\ A & \left(\hbar g^{(P)}\right)^{-1}(1 + A^2) \end{pmatrix}, \qquad
J^a{}_b(A) = \begin{pmatrix} A & \left(\hbar g^{(P)}\right)^{-1}(1 + A^2) \\ -\hbar g^{(P)} & -A \end{pmatrix}.
$$
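That these expressions do solve (12), (13) and (14) for any real A can be checked directly; the short symbolic sketch below does this, treating A and ℏg^{(P)} (written gP) as symbols evaluated at a point.

```python
import sympy as sp

# Check that Omega, g(A), J(A) quoted above satisfy the Kähler conditions (12)-(14).
A = sp.symbols('A', real=True)          # the arbitrary real function, at a point
gP = sp.symbols('gP', positive=True)    # stands for hbar * g^(P)

Omega = sp.Matrix([[0, 1], [-1, 0]])
g = sp.Matrix([[gP, A], [A, (1 + A**2) / gP]])
J = sp.Matrix([[A, (1 + A**2) / gP], [-gP, -A]])

print(sp.simplify(g * J - Omega))        # Eq. (12): zero matrix
print(sp.simplify(J.T * g * J - g))      # Eq. (13): zero matrix
print(sp.simplify(J * J + sp.eye(2)))    # Eq. (14): zero matrix
```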

The choice of A that leads to the simplest Kähler structure is A = 0, which is a unique choice in that it leads to the flat Kähler metric. I will show this by carrying out the complex transformation that leads to the canonical form for the flat Kähler metric. I set A = 0, and work with the Kähler structure given by

$$
\Omega_{ab} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \tag{15}
$$

$$
g_{ab} = \begin{pmatrix} \hbar g^{(P)} & 0 \\ 0 & \left(\hbar g^{(P)}\right)^{-1} \end{pmatrix} \tag{16}
$$

$$
J^a{}_b = \begin{pmatrix} 0 & \left(\hbar g^{(P)}\right)^{-1} \\ -\hbar g^{(P)} & 0 \end{pmatrix} \tag{17}
$$

The complex coordinate transformation is nothing but the Madelung transformation

$$
\psi = \sqrt{P}\,\exp(iS/\hbar), \qquad \psi^* = \sqrt{P}\,\exp(-iS/\hbar).
$$

In terms of the new variables, (15), (16) and (17) take the canonical form

$$
\Omega_{ab} = \begin{pmatrix} 0 & i\hbar \\ -i\hbar & 0 \end{pmatrix}, \qquad
g_{ab} = \begin{pmatrix} 0 & \hbar \\ \hbar & 0 \end{pmatrix}, \qquad
J^a{}_b = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}.
$$

The Madelung transformation is remarkable in that the Hamiltonian takes the very simple form

$$
H = \int\left[\frac{\hbar^2}{2}\,g^{\mu\nu}\,\frac{\partial \psi^*}{\partial x^\mu}\,\frac{\partial \psi}{\partial x^\nu} + V\,\psi^*\psi\right]d^n x,
$$

and the equations of motion become linear.
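This simplification can be verified pointwise: under the Madelung transformation the hydrodynamical Hamiltonian density goes over into the familiar ψ-form term by term. The symbolic sketch below checks this in one spatial dimension with g^{xx} = 1/m (a simplifying assumption made only for the check).

```python
import sympy as sp

# Check (1D): the hydrodynamical Hamiltonian density equals
# (hbar^2/2m) dpsi*/dx dpsi/dx + V psi* psi under psi = sqrt(P) exp(i S / hbar).
x, m, hbar = sp.symbols('x m hbar', positive=True)
P = sp.Function('P', positive=True)(x)
S = sp.Function('S', real=True)(x)
V = sp.Function('V', real=True)(x)

psi = sp.sqrt(P) * sp.exp(sp.I * S / hbar)
psic = sp.sqrt(P) * sp.exp(-sp.I * S / hbar)    # complex conjugate, written out

h_hydro = P * (sp.Rational(1, 2) / m * (S.diff(x)**2
                + (hbar / 2)**2 * P.diff(x)**2 / P**2) + V)
h_psi = hbar**2 / (2 * m) * psic.diff(x) * psi.diff(x) + V * psic * psi

print(sp.simplify(h_hydro - h_psi))   # -> 0
```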


Finally, one introduces a Hilbert space structure using g_{ab}, Ω_{ab} to define the Dirac product. For two wave functions φ, ϕ define the Dirac product by

$$
\begin{aligned}
\langle\varphi|\phi\rangle &= \frac{1}{2\hbar}\int\left(\varphi(x^\mu),\,\varphi^*(x^\mu)\right)\cdot\left[g + i\Omega\right]\cdot\begin{pmatrix} \phi(x^\mu) \\ \phi^*(x^\mu) \end{pmatrix} d^n x \\
&= \frac{1}{2\hbar}\int\left(\varphi(x^\mu),\,\varphi^*(x^\mu)\right)\left[\begin{pmatrix} 0 & \hbar \\ \hbar & 0 \end{pmatrix} + i\begin{pmatrix} 0 & i\hbar \\ -i\hbar & 0 \end{pmatrix}\right]\begin{pmatrix} \phi(x^\mu) \\ \phi^*(x^\mu) \end{pmatrix} d^n x \\
&= \int \varphi^*(x^\mu)\,\phi(x^\mu)\,d^n x.
\end{aligned}
$$

In this way the Hilbert space structure of quantum mechanics results from combining the symplectic structure of the hydrodynamical model with the Fisher information metric of information theory.
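The last step is elementary 2×2 algebra: with the canonical Ω and g, the matrix g + iΩ has 2ℏ in its lower-left entry and zeros elsewhere, so the sandwich collapses to φ*ϕ. The short symbolic sketch below verifies this, with the wave functions represented by their values at a single point and the integration over x left implicit.

```python
import sympy as sp

# Check: (1/2 hbar) (phi, phi*) [g + i Omega] (chi, chi*)^T = phi* chi.
hbar = sp.symbols('hbar', positive=True)
phi, chi = sp.symbols('phi chi')                 # wave-function values at one point

Omega = sp.Matrix([[0, sp.I * hbar], [-sp.I * hbar, 0]])
g = sp.Matrix([[0, hbar], [hbar, 0]])

row = sp.Matrix([[phi, sp.conjugate(phi)]])
col = sp.Matrix([chi, sp.conjugate(chi)])
product = (row * (g + sp.I * Omega) * col)[0] / (2 * hbar)

print(sp.simplify(product - sp.conjugate(phi) * chi))   # -> 0
```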
An important result that comes out of this analysis concerns the issue of suitable boundary conditions for the fields P and S. It has been pointed out [21] that the Schrödinger theory is not strictly equivalent to some of the other formulations (i.e., the hydrodynamical formulation and stochastic mechanics) because features such as the quantization of angular momentum, which are natural when the theory is formulated in terms of wave functions, require an additional constraint in a theory formulated in terms of hydrodynamical variables. For example, in the case of the hydrogen atom, the quantization of angular momentum results from requiring that the wave function be single-valued in configuration space. But the derivation of the Kähler structure and Hilbert space structure presented here shows that the Schrödinger representation follows naturally from the hydrodynamical formulation provided we take into account the role of the Fisher information metric, and furthermore that this representation is unique in that it is the coordinate system in which the Kähler structure takes the simplest form. From a purely mathematical point of view, it is not surprising that the correct boundary conditions are those that are simplest when formulated in the simplest coordinate system, i.e., single-valuedness of the canonically conjugate fields ψ, ψ*.

VI. APPENDIX

I want to examine the extremum obtained from the fixed end-point variation of the Lagrangian L_QM, equation (8). In particular, I wish to show the following: given P and S that satisfy equations (6) and (9), a small variation of the probability density P(x^μ, t) → P′(x^μ, t) = P(x^μ, t) + εδP(x^μ, t) for fixed S will lead to an increase in L_QM, as well as an increase in the Fisher information I.

I assume fixed end-point variations, and variations εδP that are well defined in the sense that P′ will have the usual properties required of a probability density (such as P′ > 0 and normalization).

Let P → P′ = P + εδP. Since P and S are solutions of the variational problem, the terms linear in ε vanish. If one keeps terms up to order ε², the change in L_QM is given by

$$
\begin{aligned}
\Delta L_{QM} &= L_{QM}(P', S) - L_{QM}(P, S) \\
&= \frac{\epsilon^2\lambda}{2}\int g^{\mu\nu}\left[\frac{(\delta P)^2}{P^3}\,\frac{\partial P}{\partial x^\mu}\,\frac{\partial P}{\partial x^\nu} - \frac{2\,\delta P}{P^2}\,\frac{\partial P}{\partial x^\mu}\,\frac{\partial(\delta P)}{\partial x^\nu} + \frac{1}{P}\,\frac{\partial(\delta P)}{\partial x^\mu}\,\frac{\partial(\delta P)}{\partial x^\nu}\right]dt\,d^n x + O(\epsilon^3).
\end{aligned}
$$

Using the relation

$$
P\,g^{\mu\nu}\,\frac{\partial}{\partial x^\mu}\!\left(\frac{\delta P}{P}\right)\frac{\partial}{\partial x^\nu}\!\left(\frac{\delta P}{P}\right)
= g^{\mu\nu}\left[\frac{(\delta P)^2}{P^3}\,\frac{\partial P}{\partial x^\mu}\,\frac{\partial P}{\partial x^\nu} - \frac{2\,\delta P}{P^2}\,\frac{\partial P}{\partial x^\mu}\,\frac{\partial(\delta P)}{\partial x^\nu} + \frac{1}{P}\,\frac{\partial(\delta P)}{\partial x^\mu}\,\frac{\partial(\delta P)}{\partial x^\nu}\right],
$$
one can write ∆L_QM as

$$
\Delta L_{QM} = \frac{\epsilon^2\lambda}{2}\int P\,g^{\mu\nu}\,\frac{\partial}{\partial x^\mu}\!\left(\frac{\delta P}{P}\right)\frac{\partial}{\partial x^\nu}\!\left(\frac{\delta P}{P}\right)dt\,d^n x + O(\epsilon^3),
$$

which shows that ∆L_QM > 0 for small variations, and therefore that the extremum of L_QM is a minimum. Furthermore, since ∆L_QM ∼ λ, it is the Fisher information term I in the Lagrangian L_QM that increases, and the extremum is also a minimum of the Fisher information.
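Both the identity used in the last step and the positivity of ∆L_QM can be illustrated numerically. The sketch below evaluates the two sides of the identity on a grid for a sample density and perturbation (both chosen arbitrarily for the illustration) and shows that the resulting integral is non-negative.

```python
import numpy as np

# Numerical illustration of the Appendix identity: for smooth P > 0 and a
# perturbation dP, the integral of P * (d/dx (dP/P))^2 equals the expanded
# form, and it is manifestly non-negative, so Delta L_QM > 0 at second order.
x = np.linspace(-8.0, 8.0, 4001)
dx = x[1] - x[0]
P = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)     # sample density (arbitrary choice)
dP = 1e-3 * np.sin(x) * P                      # sample perturbation (arbitrary choice)

Px = np.gradient(P, dx)
dPx = np.gradient(dP, dx)

lhs = np.sum(P * np.gradient(dP / P, dx)**2) * dx
rhs = np.sum(dP**2 * Px**2 / P**3 - 2 * dP * Px * dPx / P**2 + dPx**2 / P) * dx

print(lhs, rhs)        # agree to discretization accuracy; both are >= 0
```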

REFERENCES

[1] M. Reginatto, Phys. Rev. A 58 (1998) 1775.

[2] M. Reginatto, Phys. Lett. A 249 (1998) 355.

[3] M. Reginatto (unpublished).

[4] B. Roy Frieden, J. Mod. Opt. 35 (1988) 1297; Am. J. Phys. 57 (1989) 1004.

[5] B. Roy Frieden, Physics from Fisher information (Cambridge Univ. Press, Cambridge,
1999).

[6] E. Madelung, Z. Phys. 40 (1926) 322.

[7] S. Kullback, Information Theory and Statistics (Wiley, New York, 1959); corrected and revised edition (Dover, New York, 1968).

[8] J. L. Synge, Classical Dynamics, in Encyclopedia of Physics, vol. III/1, ed. S. Flügge
(Springer, Berlin, 1960).

[9] E. T. Jaynes, Clearing up Mysteries - The Original Goal, in: Maximum Entropy and
Bayesian Methods, ed. J. Skilling (Kluwer, Dordrecht, 1989).

[10] A. Hobson, J. Stat. Phys. 1 (1969) 383.

[11] E. T. Jaynes, Phys. Rev. 106 (1957) 620; IEEE Trans. Syst. Cybern., SSC-4 (1968) 227.

[12] J. E. Shore and R. Johnson, IEEE Trans. Inform. Theory, IT-26 (1980) 26.

[13] M. Reginatto and F. Lengyel, submitted to Phys. Lett. A.

[14] D. Bohm and J-P. Vigier, Phys. Rev. 96 (1954) 208.

[15] E. Nelson, Phys. Rev. 150B (1966) 1079; Quantum Fluctuations (Princeton Univ. Press,
Princeton, 1985).

[16] D. Bohm and B. J. Hiley, Phys. Rep. 172 (1989) 93.

[17] R. Cirelli, A. Manià and L. Pizzocchero, J. Math. Phys. 31 (1990) 2891; 31 (1990) 2898.

[18] A. Ashtekar and T. A. Schilling, Geometrical Formulation of Quantum Mechanics, in: On Einstein's Path, Essays in Honor of Engelbert Schücking, ed. A. Harvey (Springer, Berlin, 1999).

[19] D. C. Brody and L. Hughston, Statistical Geometry, submitted to Proc. Roy. Soc. Lond.;
e-Print Archive: gr-qc/9701051.

[20] C. R. Rao, Bull. Calcutta Math. Soc. 37 (1945) 81.

[21] T. C. Wallstrom, Phys. Rev. A 49 (1994) 1613.

