Connection between Fisher metric and the relative entropy


Can someone prove the following connection between the Fisher information metric and the relative entropy (or KL divergence) in a purely mathematically rigorous way?

$$D\bigl( p(\cdot, a+da) \,\big\|\, p(\cdot, a) \bigr) = \frac{1}{2}\, g_{i,j}\, da^{i}\, da^{j} + O(|da|^{3})$$
where $a=(a^1,\dots, a^n)$, $da=(da^1,\dots,da^n)$, $$g_{i,j}=\int \partial_i (\log p(x;a))\, \partial_j(\log p(x;a))~ p(x;a)~dx$$ and $g_{i,j}\, da^{i}\, da^{j}:=\sum_{i,j} g_{i,j}\, da^{i}\, da^{j}$ is the Einstein summation convention.

I found the above in John Baez's nice blog, where Vasileios Anagnostopoulos mentions it in the comments.
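
For reference, here is the standard formal computation behind this expansion; it is only a sketch, assuming that $p(x;a)$ is smooth in $a$, that $\int p(x;a)\,dx=1$ for all $a$, and that one may differentiate under the integral sign. What I would like to see is a rigorous justification of these steps.

Set $f(da)=D\bigl(p(\cdot,a+da)\,\big\|\,p(\cdot,a)\bigr)=\int p(x;a+da)\,\bigl[\log p(x;a+da)-\log p(x;a)\bigr]\,dx$, so that $f(0)=0$. Differentiating with respect to $da^i$ under the integral sign,
$$\partial_i f(da)=\int \partial_i p(x;a+da)\,\bigl[\log p(x;a+da)-\log p(x;a)\bigr]\,dx+\int \partial_i p(x;a+da)\,dx,$$
and the last integral vanishes because $\int p(x;a+da)\,dx\equiv 1$; in particular $\partial_i f(0)=0$. Differentiating once more, the term that still contains the bracket $\log p(x;a+da)-\log p(x;a)$ vanishes at $da=0$, leaving
$$\partial_j\partial_i f(0)=\int \partial_i p(x;a)\,\partial_j \log p(x;a)\,dx=\int \partial_i(\log p(x;a))\,\partial_j(\log p(x;a))~p(x;a)~dx=g_{i,j}.$$
Taylor's theorem then gives $f(da)=\frac{1}{2}\,g_{i,j}\,da^{i}\,da^{j}+O(|da|^{3})$, which is exactly the claimed expansion.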

2 Responses to “Connection between Fisher metric and the relative entropy”


  1. cardinal says:

    You can find a similar relationship (for a one-dimensional parameter) in equation (3) of the following paper:

    D. Guo (2009), "Relative Entropy and Score Function: New Information–Estimation Relationships through Arbitrary Additive Perturbation," in Proc. IEEE International Symposium on Information Theory, pp. 814–818.

    The authors refer to

    S. Kullback, Information Theory and Statistics. New York: Dover, 1968.

    for a proof of this result.
