1 Introduction
Symmetric positive definite (SPD) matrices emerge in a vast range of scientific applications, such as computer vision
[9, 35], elasticity [18, 31], signal processing [3, 21], medical imaging [11, 13, 14, 27, 39] and neuroscience [15]. A concrete example is the analysis of functional connectivity between brain regions. Such connectivity is often characterized by the covariance of blood-oxygen-level-dependent signals [22] generated by brain activities in different regions. The covariance is mathematically described by a covariance matrix, which is an SPD matrix. Another application is diffusion tensor imaging
[25], which is extensively used to obtain high-resolution information about the internal structure of certain tissues or organs, such as hearts and brains. For each tissue voxel, there is an SPD matrix that describes the shape of local diffusion. Such information has clinical applications; for example, it can be used to discover pathological areas surrounded by healthy tissues. The space of SPD matrices of a fixed dimension $n$, denoted by $\mathcal{S}^+$ in this article, is a convex smooth submanifold of the Euclidean space of $n \times n$ matrices. The inherited Euclidean metric further turns $\mathcal{S}^+$ into a Riemannian manifold. However, as pointed out in [4]
, this classic metric is not adequate in many applications, for two reasons. First, the distance between SPD matrices and symmetric matrices with zero or negative eigenvalues is finite, which implies that, in the context of diffusion tensor imaging, small diffusion is more likely than large diffusion. Second, the Euclidean average of SPD matrices suffers from the swelling effect, i.e., the determinant of the average can be larger than any of the original determinants. When SPD matrices are covariance matrices, as in the application of diffusion tensor imaging, determinants correspond to the overall dispersion of diffusion. Inflated determinants amount to extra diffusion that is artificially introduced by computation.
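The swelling effect is easy to observe numerically. Below is a minimal sketch, with two diagonal SPD matrices of our own choosing (not taken from the references), showing that the determinant of the Euclidean average exceeds both original determinants:

```python
import numpy as np

# Two SPD matrices with equal determinant (both equal to 1), chosen for illustration.
A = np.diag([0.1, 10.0])
B = np.diag([10.0, 0.1])

M = 0.5 * (A + B)  # Euclidean (arithmetic) average

det_A, det_B, det_M = (np.linalg.det(X) for X in (A, B, M))
# det_M ≈ 25.5 although det_A = det_B = 1: the average "swells".
assert det_M > det_A and det_M > det_B
```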
To circumvent the problems of the Euclidean metric for SPD matrices, various metrics have been introduced in the literature, such as the affine-invariant metric [30, 34] and the Log-Euclidean metric [4]. These metrics keep symmetric matrices with some nonpositive eigenvalues at an infinite distance from SPD matrices, and are not subject to the swelling effect. In addition, the Log-Euclidean framework features a closed form for the Fréchet average of SPD matrices, and it turns SPD matrices into a Lie group endowed with a bi-invariant metric. However, computation of its Riemannian exponential and logarithmic maps requires evaluating infinite series; see Eq. (2.1) and (3.4) in [4]. Compared with the Log-Euclidean metric, the affine-invariant one not only possesses easy-to-compute exponential and logarithmic maps, but also enjoys a closed form for parallel transport along geodesics; see Lemma 3 of [36]. However, to the best of our knowledge, no closed form is known for the Fréchet average of SPD matrices under the affine-invariant metric. The Fréchet average of SPD matrices has also been studied for distance functions or Riemannian metrics arising from perspectives other than the swelling effect, such as the Bures-Wasserstein metric, which is related to the theory of optimal transport [7], and the S-divergence studied in both [10] and [37]. Other related works include Riemannian geometry for positive semidefinite matrices [38, 28] and a Riemannian structure for correlation matrices [17].
In addition to the above Riemannian frameworks, it is also common in practice to approach SPD matrices via Cholesky decomposition for efficient computation, as in [12, 32, 39]. Distances on SPD matrices based on Cholesky decomposition have also been explored in the literature. For example, in [11] the distance between two SPD matrices $S_1$ and $S_2$ with Cholesky decompositions $S_1 = L_1 L_1^\top$ and $S_2 = L_2 L_2^\top$ is defined by $\|L_1 - L_2\|_F$, where each of $L_1$ and $L_2$ is a lower triangular matrix whose diagonal elements are positive, and $\|\cdot\|_F$ denotes the Frobenius matrix norm. Although this distance is simple and easy to compute, it suffers from the swelling effect, as demonstrated by the following example.
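As a concrete illustration, the Cholesky distance of [11] can be sketched in a few lines; the function name `cholesky_distance` is ours:

```python
import numpy as np

def cholesky_distance(S1, S2):
    """Cholesky distance between SPD matrices: the Frobenius norm of the
    difference of their lower triangular Cholesky factors."""
    L1, L2 = np.linalg.cholesky(S1), np.linalg.cholesky(S2)
    return np.linalg.norm(L1 - L2, ord="fro")

S1 = np.array([[2.0, 1.0], [1.0, 2.0]])
S2 = np.eye(2)

assert cholesky_distance(S1, S1) == 0.0
assert np.isclose(cholesky_distance(S1, S2), cholesky_distance(S2, S1))  # symmetry
```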
Example 1
One first notes that, under the Cholesky distance, the geodesic interpolation between $S_1 = L_1 L_1^\top$ and $S_2 = L_2 L_2^\top$ is given by $\gamma(t) = \{(1-t) L_1 + t L_2\}\{(1-t) L_1 + t L_2\}^\top$ for $t \in [0, 1]$. Taking $S_1$ and $S_2$ to be SPD matrices with equal determinants but distinct Cholesky factors, one can check that the determinant of the interpolated matrix $\gamma(t)$ exceeds that of both endpoints for $t \in (0, 1)$; that is, the interpolation inflates determinants.
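The swelling of the Cholesky-distance geodesic can be checked numerically; the matrices below are our own choice and need not match those of Example 1:

```python
import numpy as np

# Two SPD matrices with unit determinant (our own choice for illustration).
A = np.diag([0.1, 10.0])
B = np.diag([10.0, 0.1])
L1, L2 = np.linalg.cholesky(A), np.linalg.cholesky(B)

# Geodesic interpolation under the Cholesky distance:
# gamma(t) = ((1-t) L1 + t L2) ((1-t) L1 + t L2)^T.
t = 0.5
Lt = (1.0 - t) * L1 + t * L2
midpoint = Lt @ Lt.T

# det(A) = det(B) = 1, yet the midpoint has a larger determinant: swelling.
assert np.linalg.det(midpoint) > max(np.linalg.det(A), np.linalg.det(B))
```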
In this work, we propose a new Riemannian metric on SPD matrices via Cholesky decomposition. The basic idea is to introduce a new metric on the space of lower triangular matrices with positive diagonal elements and then push it forward to the space of SPD matrices via Cholesky decomposition. The metric, termed the Log-Cholesky metric, combines the advantages of the aforementioned affine-invariant metric, Log-Euclidean metric and Cholesky distance. First, it is as simple as the Cholesky distance, but not subject to the swelling effect. Second, like the Log-Euclidean metric, the presented metric enjoys Lie group bi-invariance, as well as a closed form for the Log-Cholesky average of SPD matrices. This bi-invariant Lie group structure seems not to be shared by the aforementioned works in the literature other than [4]. Third, it features simple and easy-to-compute expressions for Riemannian exponential and logarithmic maps, in contrast with the Log-Euclidean metric. Finally, like the affine-invariant metric, the expression for parallel transport along geodesics is simple and easy to compute under the presented metric. Parallel transport is important in applications such as regression methods on Riemannian manifolds [36, 40, 41].
It is noted that Cholesky decomposition is also explored in [17] for a Riemannian geometry of correlation matrices with rank no larger than a fixed bound. Despite a certain similarity in the use of Cholesky decomposition, that work is fundamentally different from ours. First, it studies correlation matrices rather than SPD matrices; for a correlation matrix, the diagonal elements are restricted to be one. Second, the Riemannian structures considered in [17] and in our work are different. For example, the so-called Cholesky manifold in [17] is a Riemannian submanifold of a Euclidean space, while the Riemannian manifold to be proposed here is not. Finally, Cholesky decomposition is utilized in [17] as a way to parameterize correlation matrices, rather than to push forward a new manifold structure to correlation matrices.
We structure the rest of this article as follows. Notation and basic properties of lower triangular and SPD matrices are collected in Section 2. In Section 3, we introduce a new Lie group structure on SPD matrices and define the Log-Cholesky metric on the group. Basic features such as Riemannian exponential/logarithmic maps, geodesics and parallel transport are also characterized. Section 4 is devoted to the Log-Cholesky mean/average of distributions on SPD matrices. We conclude the article in Section 5.
2 Lower triangular matrices and SPD matrices
We start by introducing some notation and recalling basic properties of lower triangular and SPD matrices. Cholesky decomposition is then shown to be a diffeomorphism between the lower triangular matrix manifold and the SPD manifold. This result serves as a cornerstone of our development: it enables us to push forward a Riemannian metric defined on the space of triangular matrices to the space of SPD matrices.
2.1 Notations and basic properties
Throughout this paper, $n$ is a fixed positive integer that represents the dimension of the matrices under consideration. For a matrix $A$, we use $A_{ij}$ to denote its element in the $i$th row and $j$th column. The notation $\lfloor A \rfloor$ denotes the $n \times n$ matrix whose $(i, j)$ element is $A_{ij}$ if $i > j$ and is zero otherwise, while $\mathbb{D}(A)$ denotes the $n \times n$ diagonal matrix whose $(i, i)$ element is $A_{ii}$. In other words, $\lfloor A \rfloor$ is the strictly lower triangular part, while $\mathbb{D}(A)$ is the diagonal part, of $A$. The trace of a matrix $A$ is denoted by $\mathrm{tr}(A)$, and the determinant by $\det(A)$. For two square matrices $A$ and $B$, $\langle A, B \rangle_F = \mathrm{tr}(A^\top B)$ denotes the Frobenius inner product between them, and the induced norm is denoted by $\|A\|_F$.
The matrix exponential of a real matrix $A$ is defined by $\exp(A) = \sum_{k=0}^{\infty} A^k / k!$, and its inverse, the matrix logarithm $\log$, is taken whenever it exists and is real. It is noted that the exponential of a lower triangular matrix is also lower triangular. In addition, the matrix exponential of a diagonal matrix can be obtained by applying the exponential function to each diagonal element, and the matrix logarithm of a diagonal matrix with positive diagonal elements can be computed in a similar way. Thus, the matrix exponential/logarithm of a diagonal matrix is diagonal.
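These facts about the matrix exponential and logarithm can be verified numerically, e.g. with SciPy's `expm` and `logm`:

```python
import numpy as np
from scipy.linalg import expm, logm

# The exponential of a lower triangular matrix is lower triangular.
L = np.array([[1.0, 0.0], [2.0, -0.5]])
E = expm(L)
assert np.allclose(E, np.tril(E))

# For diagonal matrices, expm/logm act entrywise on the diagonal.
d = np.array([0.5, 2.0])
assert np.allclose(expm(np.diag(d)), np.diag(np.exp(d)))
assert np.allclose(logm(np.diag(d)), np.diag(np.log(d)))
```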
The space of $n \times n$ lower triangular matrices is denoted by $\mathcal{L}$, and the subset of $\mathcal{L}$ whose diagonal elements are all positive is denoted by $\mathcal{L}^+$. It is straightforward to check the following properties of lower triangular matrices.

$\det(L) = \prod_{j=1}^{n} L_{jj}$ for $L \in \mathcal{L}$.
$L_1 + L_2 \in \mathcal{L}$ and $L_1 L_2 \in \mathcal{L}$ for $L_1, L_2 \in \mathcal{L}$.
$L_1 + L_2 \in \mathcal{L}^+$ and $L_1 L_2 \in \mathcal{L}^+$ if $L_1, L_2 \in \mathcal{L}^+$.
If $L \in \mathcal{L}^+$, then the inverse $L^{-1}$ exists and belongs to $\mathcal{L}^+$.
For $L_1, L_2 \in \mathcal{L}$, $\lfloor L_1 + L_2 \rfloor = \lfloor L_1 \rfloor + \lfloor L_2 \rfloor$ and $\mathbb{D}(L_1 + L_2) = \mathbb{D}(L_1) + \mathbb{D}(L_2)$.
$\mathbb{D}(L_1 L_2) = \mathbb{D}(L_1) \mathbb{D}(L_2)$ for $L_1, L_2 \in \mathcal{L}$.
$\mathbb{D}(L^{-1}) = \mathbb{D}(L)^{-1}$ for $L \in \mathcal{L}^+$.
These properties show that both $\mathcal{L}$ and $\mathcal{L}^+$ are closed under matrix addition and multiplication, and that the operators $\lfloor \cdot \rfloor$ and $\mathbb{D}(\cdot)$ interact well with these operations.
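A quick numerical sanity check of these closure and interplay properties, with random triangular matrices (the helper names are ours):

```python
import numpy as np

def diag_part(A):
    """D(A): the diagonal part of A."""
    return np.diag(np.diag(A))

rng = np.random.default_rng(0)
n = 4
# Two random elements of L^+: lower triangular with positive diagonal.
L1 = np.tril(rng.standard_normal((n, n)), k=-1) + np.diag(rng.uniform(0.5, 2.0, n))
L2 = np.tril(rng.standard_normal((n, n)), k=-1) + np.diag(rng.uniform(0.5, 2.0, n))

P = L1 @ L2
assert np.allclose(P, np.tril(P))          # the product stays lower triangular ...
assert np.all(np.diag(P) > 0)              # ... with positive diagonal
assert np.allclose(diag_part(P), diag_part(L1) @ diag_part(L2))    # D(L1 L2) = D(L1) D(L2)
assert np.allclose(np.linalg.inv(L1), np.tril(np.linalg.inv(L1)))  # inverse stays in L^+
```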
Recall that $\mathcal{S}^+$ is defined as the collection of SPD matrices. We denote the space of symmetric matrices by $\mathcal{S}$. Symmetric matrices and SPD matrices possess numerous algebraic and analytic properties that are well documented in [6]. Below are some of them to be used in the sequel.
All eigenvalues of an SPD matrix $S$ are positive, and $\det(S)$ equals the product of its eigenvalues. Therefore, the determinant of an SPD matrix is positive.
$\exp(A)$ is an SPD matrix for a symmetric matrix $A$, while $\log(S)$ is a symmetric matrix for an SPD matrix $S$.
Diagonal elements of an SPD matrix $S$ are all positive. This can be seen from the fact that $S_{ii} = e_i^\top S e_i > 0$, where $e_i$ is the unit vector with 1 at the $i$th coordinate and 0 elsewhere.
2.2 Cholesky decomposition
Cholesky decomposition, named after André-Louis Cholesky, represents a real SPD matrix $S$ as the product of a lower triangular matrix $L$ and its transpose, i.e., $S = L L^\top$. If the diagonal elements of $L$ are restricted to be positive, then the decomposition is unique, according to Theorem 4.2.5 of [16]. Such a lower triangular matrix, denoted by $\mathscr{L}(S)$, is called the Cholesky factor of $S$. Since in addition $\mathscr{L}(L L^\top) = L$ for each $L \in \mathcal{L}^+$, the map $\mathscr{L}: \mathcal{S}^+ \to \mathcal{L}^+$ is bijective. In other words, there is a one-to-one correspondence between SPD matrices and lower triangular matrices whose diagonal elements are all positive.
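The one-to-one correspondence can be illustrated with a round trip through NumPy's `cholesky`, which returns precisely the factor with positive diagonal:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
S = A @ A.T + 3.0 * np.eye(3)   # a generic SPD matrix

L = np.linalg.cholesky(S)       # the unique factor with positive diagonal
assert np.allclose(L, np.tril(L)) and np.all(np.diag(L) > 0)
assert np.allclose(L @ L.T, S)                      # S = L L^T
assert np.allclose(np.linalg.cholesky(L @ L.T), L)  # round trip recovers L
```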
The space $\mathcal{L}^+$, called the Cholesky space in this paper, is a smooth submanifold of $\mathcal{L}$, which is identified with the Euclidean space $\mathbb{R}^{n(n+1)/2}$. Similarly, the space $\mathcal{S}^+$ of SPD matrices is a smooth submanifold of the space $\mathcal{S}$ of symmetric matrices, also identified with $\mathbb{R}^{n(n+1)/2}$. As a map between the smooth manifolds $\mathcal{S}^+$ and $\mathcal{L}^+$, the map $\mathscr{L}$ is indeed a diffeomorphism. This fact will be exploited to endow $\mathcal{S}^+$ with a new Riemannian metric, to be presented in Section 3.2.
Proposition 2
The Cholesky map $\mathscr{L}$ is a diffeomorphism between the smooth manifolds $\mathcal{S}^+$ and $\mathcal{L}^+$.
Proof
We have argued that $\mathscr{L}$ is a bijection. To see that it is also smooth, for $S \in \mathcal{S}^+$ with $L = \mathscr{L}(S)$, we write $S_{ij} = \sum_{k=1}^{j} L_{ik} L_{jk}$ for $i \geq j$, from which we deduce that

(1) $L_{ij} = \Big( S_{ij} - \sum_{k=1}^{j-1} L_{ik} L_{jk} \Big) \Big/ L_{jj}$ for $i > j$, and $L_{jj} = \Big( S_{jj} - \sum_{k=1}^{j-1} L_{jk}^2 \Big)^{1/2}$.

The existence of a unique Cholesky factor for every SPD matrix suggests that $L_{jj} > 0$ for all $j$. In particular, $L_{11} = S_{11}^{1/2}$ with $S_{11} > 0$, so the map $S \mapsto L_{11}$, as well as its reciprocal $S \mapsto L_{11}^{-1}$, is smooth.
Now assume that the maps $S \mapsto L_{i'j'}$ and $S \mapsto L_{j'j'}^{-1}$ are smooth for all pairs $(i', j')$ with $i' \geq j'$ that precede $(i, j)$ in the column-major order of the lower triangular part. As we just showed, this hypothesis is true for the first pair $(1, 1)$. If $i > j$, from Eq. (1) we see that $L_{ij}$ results from a sequence of elementary operations, such as multiplication, addition and subtraction, applied to the maps $S \mapsto S_{ij}$, $S \mapsto L_{ik}$ and $S \mapsto L_{jk}$ for $k < j$, and $S \mapsto L_{jj}^{-1}$, which are all smooth according to the induction hypothesis. As these elementary operations are all smooth, $S \mapsto L_{ij}$ is also smooth. If $i = j$, then $S \mapsto L_{jj}$, as well as $S \mapsto L_{jj}^{-1}$, is smooth via similar reasoning, based on the additional fact that $S_{jj} - \sum_{k=1}^{j-1} L_{jk}^2 = L_{jj}^2 > 0$ and the square-root operator is smooth on the set of positive real numbers. The above derivation then shows that the induction hypothesis is also true for the next pair.
Consequently, by mathematical induction, the hypothesis is true for all pairs $(i, j)$ with $i \geq j$. In other words, $\mathscr{L}$ is a smooth manifold map. Its inverse, denoted by $\mathscr{S}$ and given by $\mathscr{S}(L) = L L^\top$, is clearly smooth. Therefore, $\mathscr{L}$ is a diffeomorphism.
3 Lie group structure and bi-invariant metric
In this section, we first construct a group structure and a bi-invariant metric on the manifold $\mathcal{L}^+$, and then push them forward to the manifold $\mathcal{S}^+$ via the Cholesky map. Parallel transport on the SPD manifold is also investigated. For background on Riemannian geometry and Lie groups, we recommend the monographs [19, 24].
3.1 Riemannian geometry on Cholesky spaces
For matrices in $\mathcal{L}^+$, the off-diagonal elements of the lower triangular part are unconstrained, while the diagonal ones are restricted to be positive; thus $L \in \mathcal{L}^+$ can be parameterized by the pair $(\lfloor L \rfloor, \mathbb{D}(L))$. This motivates us to endow the unconstrained part and the positive diagonal part with different metrics and then combine them into a Riemannian metric on $\mathcal{L}^+$, as follows. First, we note that the tangent space of $\mathcal{L}^+$ at a given $L$ is identified with the linear space $\mathcal{L}$. On such a tangent space, we treat the strictly lower triangular part as a Euclidean space with the usual Frobenius inner product, while for the diagonal part we use the inner product $\sum_{j=1}^{n} X_{jj} Y_{jj} L_{jj}^{-2}$, which depends on the base point $L$. Combining these two components, we define a metric $\tilde{g}$ on the tangent spaces (identified with $\mathcal{L}$) by

$\tilde{g}_L(X, Y) = \langle \lfloor X \rfloor, \lfloor Y \rfloor \rangle_F + \sum_{j=1}^{n} X_{jj} Y_{jj} L_{jj}^{-2}.$
It is straightforward to show that the space $\mathcal{L}^+$, equipped with the metric $\tilde{g}$, is a Riemannian manifold. To investigate its basic properties, we begin with geodesics on this manifold.
Proposition 3
On the Riemannian manifold $(\mathcal{L}^+, \tilde{g})$, the geodesic starting at $L \in \mathcal{L}^+$ with direction $X \in \mathcal{L}$ is given by

$\gamma_{L, X}(t) = \lfloor L \rfloor + t \lfloor X \rfloor + \mathbb{D}(L) \exp\{ t\, \mathbb{D}(X) \mathbb{D}(L)^{-1} \}.$
Proof
Clearly, $\gamma_{L, X}(0) = L$ and $\gamma_{L, X}'(0) = X$. Consider the chart that maps $K \in \mathcal{L}^+$ to the vector collecting the strictly lower triangular elements $K_{ij}$, $i > j$, together with the logarithms $\log K_{jj}$ of the diagonal elements. In this chart, the metric coefficients are constant: the coordinate frame $\{E_a\}$ satisfies $\tilde{g}(E_a, E_b) = \delta_{ab}$, and hence all Christoffel symbols vanish, as

$\Gamma_{ab}^{c} = \tfrac{1}{2} g^{cd} \left( \partial_a g_{bd} + \partial_b g_{ad} - \partial_d g_{ab} \right) = 0,$

where the Einstein summation convention is assumed. It can be checked that the coordinates of the curve $\gamma_{L, X}$ in this chart are affine in $t$: the $(i, j)$ coordinate is $L_{ij} + t X_{ij}$ for $i > j$, and the $j$th diagonal coordinate is $\log L_{jj} + t X_{jj} L_{jj}^{-1}$. Now, it is an easy task to verify the geodesic equations

$\ddot{\gamma}^c + \Gamma_{ab}^{c}\, \dot{\gamma}^a \dot{\gamma}^b = 0.$

Therefore, $\gamma_{L, X}$ is the claimed geodesic.
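A sketch of the geodesic formula of Proposition 3 (function name ours), together with a check that it starts at $L$ and stays in the Cholesky space:

```python
import numpy as np

def cholesky_geodesic(L, X, t):
    """Geodesic in the Cholesky space starting at L with direction X:
    gamma(t) = floor(L) + t*floor(X) + D(L) exp{t D(X) D(L)^{-1}}."""
    lower = np.tril(L, k=-1) + t * np.tril(X, k=-1)    # unconstrained part
    diag = np.diag(L) * np.exp(t * np.diag(X) / np.diag(L))  # positive diagonal part
    return lower + np.diag(diag)

L = np.array([[1.0, 0.0], [0.3, 2.0]])
X = np.array([[-0.5, 0.0], [1.0, 0.4]])

assert np.allclose(cholesky_geodesic(L, X, 0.0), L)   # gamma(0) = L
for t in (-2.0, 0.5, 3.0):
    G = cholesky_geodesic(L, X, t)
    assert np.allclose(G, np.tril(G)) and np.all(np.diag(G) > 0)  # stays in L^+
```

Note that the diagonal stays positive for all real $t$ thanks to the exponential, so the geodesics are complete.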
Given the above proposition, we can immediately derive the Riemannian exponential map at $L$, which is given by

$\widetilde{\mathrm{Exp}}_L X = \gamma_{L, X}(1) = \lfloor L \rfloor + \lfloor X \rfloor + \mathbb{D}(L) \exp\{ \mathbb{D}(X) \mathbb{D}(L)^{-1} \}.$

Also, for $K \in \mathcal{L}^+$, with

$X = \lfloor K \rfloor - \lfloor L \rfloor + \mathbb{D}(L) \{ \log \mathbb{D}(K) - \log \mathbb{D}(L) \},$

one has $\gamma_{L, X}(1) = K$. Since $\gamma_{L, X}(0) = L$ and $\gamma_{L, X}(1) = K$, $\gamma_{L, X}$ is the geodesic connecting $L$ and $K$, and the above $X$ is the Riemannian logarithm $\widetilde{\mathrm{Log}}_L K$. Therefore, the distance function on $\mathcal{L}^+$ induced by $\tilde{g}$, denoted by $d_{\mathcal{L}^+}$, is given by $d_{\mathcal{L}^+}(L, K) = \{ \tilde{g}_L(X, X) \}^{1/2}$, where $X$ is the same as above. The expression for the distance function can be equivalently and more compactly written as

$d_{\mathcal{L}^+}(L, K) = \big\{ \| \lfloor L \rfloor - \lfloor K \rfloor \|_F^2 + \| \log \mathbb{D}(L) - \log \mathbb{D}(K) \|_F^2 \big\}^{1/2}.$
Table 1 summarizes the above basic properties of the manifold $(\mathcal{L}^+, \tilde{g})$.
tangent space at $L \in \mathcal{L}^+$: $\mathcal{L}$
Riemannian metric: $\tilde{g}_L(X, Y) = \langle \lfloor X \rfloor, \lfloor Y \rfloor \rangle_F + \sum_{j=1}^{n} X_{jj} Y_{jj} L_{jj}^{-2}$
geodesic emanating from $L$ with direction $X$: $\gamma_{L, X}(t) = \lfloor L \rfloor + t \lfloor X \rfloor + \mathbb{D}(L) \exp\{ t\, \mathbb{D}(X) \mathbb{D}(L)^{-1} \}$
Riemannian exponential map at $L$: $\widetilde{\mathrm{Exp}}_L X = \lfloor L \rfloor + \lfloor X \rfloor + \mathbb{D}(L) \exp\{ \mathbb{D}(X) \mathbb{D}(L)^{-1} \}$
Riemannian logarithmic map at $L$: $\widetilde{\mathrm{Log}}_L K = \lfloor K \rfloor - \lfloor L \rfloor + \mathbb{D}(L) \{ \log \mathbb{D}(K) - \log \mathbb{D}(L) \}$
geodesic distance between $L$ and $K$: $d_{\mathcal{L}^+}(L, K) = \{ \| \lfloor L \rfloor - \lfloor K \rfloor \|_F^2 + \| \log \mathbb{D}(L) - \log \mathbb{D}(K) \|_F^2 \}^{1/2}$
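The closed-form distance on the Cholesky space can be sketched directly (function name ours):

```python
import numpy as np

def log_cholesky_distance_chol(L, K):
    """Geodesic distance on the Cholesky space:
    d(L, K) = { ||floor(L) - floor(K)||_F^2 + ||log D(L) - log D(K)||_F^2 }^{1/2}."""
    off = np.tril(L, k=-1) - np.tril(K, k=-1)
    dia = np.log(np.diag(L)) - np.log(np.diag(K))
    return np.sqrt(np.sum(off ** 2) + np.sum(dia ** 2))

L = np.array([[1.0, 0.0], [0.3, 2.0]])
K = np.array([[2.0, 0.0], [-0.1, 0.5]])

assert log_cholesky_distance_chol(L, L) == 0.0  # d(L, L) = 0
assert np.isclose(log_cholesky_distance_chol(L, K),
                  log_cholesky_distance_chol(K, L))  # symmetry
```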
3.2 Riemannian metric for SPD matrices
As mentioned previously, the space $\mathcal{S}^+$ of SPD matrices is a smooth submanifold of the space of symmetric matrices, and its tangent space at a given SPD matrix is identified with $\mathcal{S}$. We also showed that the map $\mathscr{S}: \mathcal{L}^+ \to \mathcal{S}^+$ given by $\mathscr{S}(L) = L L^\top$ is a diffeomorphism between $\mathcal{L}^+$ and $\mathcal{S}^+$. For a square matrix $A$, we define the lower triangular matrix $A_{\frac{1}{2}} := \lfloor A \rfloor + \mathbb{D}(A) / 2$. In other words, the matrix $A_{\frac{1}{2}}$ is the lower triangular part of $A$ with the diagonal elements halved. The differential of $\mathscr{S}$ is given in the following.
Proposition 4
The differential $(d\mathscr{S})_L: \mathcal{L} \to \mathcal{S}$ of $\mathscr{S}$ at $L \in \mathcal{L}^+$ is given by

$(d\mathscr{S})_L X = L X^\top + X L^\top.$

Also, the inverse of $(d\mathscr{S})_L$ exists for all $L \in \mathcal{L}^+$ and is given by

$(d\mathscr{S})_L^{-1} W = L \big( L^{-1} W L^{-\top} \big)_{\frac{1}{2}}$

for $W \in \mathcal{S}$.
Proof
Let $X \in \mathcal{L}$ and set $c(t) = L + t X$. Then $c(t)$ is a curve in $\mathcal{L}^+$ passing through $L$ for $t$ in a sufficiently small neighborhood of zero. Note that for every such $t$, $\mathscr{S}(c(t)) = (L + t X)(L + t X)^\top$, so $\mathscr{S} \circ c$ is a curve passing through $\mathscr{S}(L)$. The differential is then derived from

$(d\mathscr{S})_L X = \frac{d}{dt}\Big|_{t=0} (L + t X)(L + t X)^\top = L X^\top + X L^\top.$

On the other hand, if $W = L X^\top + X L^\top$, then since $L$ is invertible, we have

$L^{-1} W L^{-\top} = X^\top L^{-\top} + L^{-1} X.$

Noting that $L^{-1} X$ is also a lower triangular matrix and that the matrix on the left-hand side is symmetric, we deduce that $(L^{-1} W L^{-\top})_{\frac{1}{2}} = L^{-1} X$, which gives rise to $X = L (L^{-1} W L^{-\top})_{\frac{1}{2}}$. The linear map $(d\mathscr{S})_L$ is one-to-one since, from the above derivation, $(d\mathscr{S})_L X = 0$ if and only if $X = 0$.
Given the above proposition, the manifold map $\mathscr{S}$, which is exactly the inverse of the Cholesky map discussed in Section 2.2, induces a Riemannian metric on $\mathcal{S}^+$, denoted by $g$ and called the Log-Cholesky metric, given by

(2) $g_S(W, V) = \tilde{g}_L \big( L (L^{-1} W L^{-\top})_{\frac{1}{2}},\ L (L^{-1} V L^{-\top})_{\frac{1}{2}} \big),$

where $L = \mathscr{L}(S)$ is the Cholesky factor of $S$, and $W, V \in \mathcal{S}$ are tangent vectors at $S$. This implies that

$g_{\mathscr{S}(L)} \big( (d\mathscr{S})_L X, (d\mathscr{S})_L Y \big) = \tilde{g}_L(X, Y)$
for all $L \in \mathcal{L}^+$ and $X, Y \in \mathcal{L}$. According to Definition 7.57 of [26], the map $\mathscr{S}$ is an isometry between $\mathcal{L}^+$ and $\mathcal{S}^+$. A Riemannian isometry provides a correspondence of Riemannian properties and objects between two Riemannian manifolds. This enables us to study the properties of $\mathcal{S}^+$ via the manifold $\mathcal{L}^+$ and the isometry $\mathscr{S}$. For example, we can obtain geodesics on $\mathcal{S}^+$ by mapping geodesics on $\mathcal{L}^+$. More precisely, the geodesic emanating from $S \in \mathcal{S}^+$ with direction $W \in \mathcal{S}$ is given by $\mathscr{S}(\gamma_{L, X}(t))$,
where $L = \mathscr{L}(S)$ and $X = (d\mathscr{S})_L^{-1} W$. Similarly, the Riemannian exponential map at $S$ is given by

$\mathrm{Exp}_S W = \mathscr{S}\big( \widetilde{\mathrm{Exp}}_L X \big),$

while the Riemannian logarithm that characterizes the geodesic between $S_1$ and $S_2$ is

$\mathrm{Log}_{S_1} S_2 = (d\mathscr{S})_{L_1} \widetilde{\mathrm{Log}}_{L_1} L_2,$

with $L_1 = \mathscr{L}(S_1)$ and $L_2 = \mathscr{L}(S_2)$. Also, the geodesic distance between $S_1$ and $S_2$ is

$d_{\mathcal{S}^+}(S_1, S_2) = d_{\mathcal{L}^+}\big( \mathscr{L}(S_1), \mathscr{L}(S_2) \big).$
Moreover, the Levi-Civita connection of $\mathcal{S}^+$ can be obtained from the Levi-Civita connection of $\mathcal{L}^+$. To see this, let $U$ and $V$ be two smooth vector fields on $\mathcal{S}^+$. Define vector fields $\tilde{U}$ and $\tilde{V}$ on $\mathcal{L}^+$ by $\tilde{U}(L) = (d\mathscr{S})_L^{-1} U(\mathscr{S}(L))$ and $\tilde{V}(L) = (d\mathscr{S})_L^{-1} V(\mathscr{S}(L))$ for all $L \in \mathcal{L}^+$. Then the connection on $\mathcal{S}^+$ is the pushforward under $\mathscr{S}$ of the connection on $\mathcal{L}^+$ applied to $\tilde{U}$ and $\tilde{V}$, and the Christoffel symbols needed to compute the latter connection have been given in the proof of Proposition 3.
Table 2 summarizes some basic properties of the manifold $(\mathcal{S}^+, g)$. Note that the differential $(d\mathscr{S})_L^{-1}$ can be computed efficiently, since it only involves a Cholesky decomposition and the inverse of a lower triangular matrix, for both of which efficient algorithms exist. Consequently, all maps in Table 2 can be evaluated efficiently. In contrast, computation of the Riemannian exponential/logarithmic maps for the Log-Euclidean metric [4] requires evaluating infinite series; see Eq. (2.1) and Table 4.1 of [4].
tangent space at $S \in \mathcal{S}^+$: $\mathcal{S}$
differential of $\mathscr{S}$ at $L$: $(d\mathscr{S})_L X = L X^\top + X L^\top$
differential of $\mathscr{L}$ at $S$: $(d\mathscr{L})_S W = L (L^{-1} W L^{-\top})_{\frac{1}{2}}$ with $L = \mathscr{L}(S)$
Riemannian metric: $g_S(W, V) = \tilde{g}_L\big( (d\mathscr{L})_S W, (d\mathscr{L})_S V \big)$
geodesic emanating from $S$ with direction $W$: $\mathscr{S}\big( \gamma_{L, (d\mathscr{L})_S W}(t) \big)$
Riemannian exponential map at $S$: $\mathrm{Exp}_S W = \mathscr{S}\big( \widetilde{\mathrm{Exp}}_L (d\mathscr{L})_S W \big)$
Riemannian logarithmic map at $S_1$: $\mathrm{Log}_{S_1} S_2 = (d\mathscr{S})_{L_1} \widetilde{\mathrm{Log}}_{L_1} L_2$ with $L_i = \mathscr{L}(S_i)$
geodesic distance between $S_1$ and $S_2$: $d_{\mathcal{S}^+}(S_1, S_2) = \{ \| \lfloor L_1 \rfloor - \lfloor L_2 \rfloor \|_F^2 + \| \log \mathbb{D}(L_1) - \log \mathbb{D}(L_2) \|_F^2 \}^{1/2}$
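To illustrate the efficiency claim, the differential of the Cholesky map can be computed with one Cholesky factorization and two triangular solves; the sketch below (helper names ours) checks the closed form $(d\mathscr{L})_S W = L (L^{-1} W L^{-\top})_{\frac{1}{2}}$ against a finite difference:

```python
import numpy as np
from scipy.linalg import solve_triangular

def half(A):
    """Lower triangular part of A with the diagonal halved."""
    return np.tril(A, k=-1) + 0.5 * np.diag(np.diag(A))

def dchol(S, W):
    """Differential of the Cholesky map at S applied to symmetric W:
    (dL)_S W = L @ half(L^{-1} W L^{-T}); only one Cholesky factorization
    and two triangular solves are needed."""
    L = np.linalg.cholesky(S)
    M = solve_triangular(L, W, lower=True)        # L^{-1} W
    M = solve_triangular(L, M.T, lower=True).T    # L^{-1} W L^{-T}
    return L @ half(M)

# Finite-difference sanity check (our own test, not from the paper).
S = np.array([[2.0, 0.5], [0.5, 1.0]])
W = np.array([[0.1, -0.2], [-0.2, 0.3]])
eps = 1e-6
fd = (np.linalg.cholesky(S + eps * W) - np.linalg.cholesky(S)) / eps
assert np.allclose(dchol(S, W), fd, atol=1e-4)
```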
3.3 Lie group structure and bi-invariant metrics
We define an operation $\odot$ on $\mathcal{L}^+$ by

$L_1 \odot L_2 := \lfloor L_1 \rfloor + \lfloor L_2 \rfloor + \mathbb{D}(L_1) \mathbb{D}(L_2).$

Note that $L_1 \odot L_2 \in \mathcal{L}^+$. Moreover, $L_1 \odot L_2 = L_2 \odot L_1$ for $L_1, L_2 \in \mathcal{L}^+$. It is not difficult to see that $\odot$ is a smooth commutative group operation on the manifold $\mathcal{L}^+$, where the inverse of $L$, denoted by $L^{\ominus}$, is $-\lfloor L \rfloor + \mathbb{D}(L)^{-1}$. The left translation by $L$ is denoted by $\ell_L: K \mapsto L \odot K$. One can check that the differential of this operation at $K$ is

(3) $(d\ell_L)_K X = \lfloor X \rfloor + \mathbb{D}(L) \mathbb{D}(X),$

where it is noted that the differential does not depend on $K$. Given the above expression, one can find that

$\tilde{g}_{L \odot K}\big( (d\ell_L)_K X, (d\ell_L)_K Y \big) = \tilde{g}_K(X, Y).$

Similar observations are made for right translations. Thus, the metric $\tilde{g}$ is a bi-invariant metric that turns $\mathcal{L}^+$ into a Lie group.
The group operation $\odot$ and the maps $\mathscr{L}$ and $\mathscr{S}$ together induce a smooth operation on $\mathcal{S}^+$, defined by

$S_1 \odot S_2 := \mathscr{S}\big( \mathscr{L}(S_1) \odot \mathscr{L}(S_2) \big).$

In addition, both $\mathscr{L}$ and $\mathscr{S}$ are Riemannian isometries and group isomorphisms between the Lie groups $(\mathcal{L}^+, \odot)$ and $(\mathcal{S}^+, \odot)$.
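A numerical sketch of the group structure on the Cholesky space (function name ours): the operation is commutative, the identity matrix is the identity element, and inverses have the stated closed form:

```python
import numpy as np

def group_op(L, K):
    """Group operation on the Cholesky space:
    L ⊙ K = floor(L) + floor(K) + D(L) D(K)."""
    return np.tril(L, k=-1) + np.tril(K, k=-1) + np.diag(np.diag(L) * np.diag(K))

L = np.array([[1.5, 0.0], [0.3, 2.0]])
K = np.array([[0.5, 0.0], [-1.0, 0.25]])
I = np.eye(2)

assert np.allclose(group_op(L, K), group_op(K, L))   # commutativity
assert np.allclose(group_op(L, I), L)                # identity element
L_inv = -np.tril(L, k=-1) + np.diag(1.0 / np.diag(L))
assert np.allclose(group_op(L, L_inv), I)            # closed-form inverse
```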
Theorem 5
The space $(\mathcal{S}^+, \odot)$ is an abelian Lie group. Moreover, the metric $g$ defined in Eq. (2) is a bi-invariant metric on this group.
Proof
It is clear that $\mathcal{L}^+$ is closed under the operation $\odot$, and the identity element is the identity matrix $I_n$. For $L \in \mathcal{L}^+$, the inverse under $\odot$ is given by $L^{\ominus} = -\lfloor L \rfloor + \mathbb{D}(L)^{-1}$. For associativity, we first observe that $\lfloor L_1 \odot L_2 \rfloor = \lfloor L_1 \rfloor + \lfloor L_2 \rfloor$ and $\mathbb{D}(L_1 \odot L_2) = \mathbb{D}(L_1) \mathbb{D}(L_2)$, based on which we further deduce that

$(L_1 \odot L_2) \odot L_3 = \lfloor L_1 \rfloor + \lfloor L_2 \rfloor + \lfloor L_3 \rfloor + \mathbb{D}(L_1) \mathbb{D}(L_2) \mathbb{D}(L_3) = L_1 \odot (L_2 \odot L_3).$

Therefore, $(\mathcal{L}^+, \odot)$ is a group. The commutativity and smoothness of the induced operation on $\mathcal{S}^+$ stem from the commutativity and smoothness of $\odot$ on $\mathcal{L}^+$, respectively. It can be checked that $\mathscr{S}$ is a group isomorphism and isometry between the Lie groups $(\mathcal{L}^+, \odot)$ and $(\mathcal{S}^+, \odot)$, respectively endowed with the Riemannian metrics $\tilde{g}$ and $g$. Then, the bi-invariance of $g$ follows from the bi-invariance of $\tilde{g}$.
3.4 Parallel transport along geodesics on $\mathcal{S}^+$
In some applications, such as statistical analysis or machine learning on Riemannian manifolds, parallel transport of tangent vectors along geodesics is required. For instance, in [40], which studies regression on SPD-valued data, tangent vectors are transported to a common tangent space to derive statistical estimators of interest. Also, optimization in the context of statistics for manifold-valued data often involves parallel transport of tangent vectors; examples include Hamiltonian Monte Carlo algorithms [23], as well as optimization algorithms [20] to train manifold-valued Gaussian mixture models. In these scenarios, Riemannian metrics on SPD matrices that allow efficient computation of parallel transport are attractive, in particular in the era of big data. In this regard, as discussed in the introduction, evaluation of parallel transport along geodesics under the affine-invariant metric is simple and computationally efficient, while that under the Log-Euclidean metric is computationally intensive. Below we show that parallel transport under the presented metric also has a simple form, starting with a lemma.
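Anticipating the closed form derived below, parallel transport on the Cholesky space amounts to keeping the strictly lower part and rescaling the diagonal part. The sketch below (function names ours, assuming the transport formula $X \mapsto \lfloor X \rfloor + \mathbb{D}(K) \mathbb{D}(L)^{-1} \mathbb{D}(X)$) verifies that the inner product is preserved:

```python
import numpy as np

def transport(L, K, X):
    """Parallel transport of tangent vector X at L to K along the geodesic,
    via left translation in the abelian group:
    X -> floor(X) + D(K) D(L)^{-1} D(X)."""
    return np.tril(X, k=-1) + np.diag(np.diag(K) / np.diag(L) * np.diag(X))

def metric(L, X, Y):
    """Riemannian metric on the Cholesky space at base point L."""
    return (np.sum(np.tril(X, k=-1) * np.tril(Y, k=-1))
            + np.sum(np.diag(X) * np.diag(Y) / np.diag(L) ** 2))

L = np.array([[1.0, 0.0], [0.2, 3.0]])
K = np.array([[2.0, 0.0], [-0.4, 0.5]])
X = np.array([[0.7, 0.0], [1.1, -0.6]])
Y = np.array([[-0.2, 0.0], [0.5, 0.9]])

# Parallel transport is a linear isometry between tangent spaces.
assert np.isclose(metric(L, X, Y),
                  metric(K, transport(L, K, X), transport(L, K, Y)))
```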
Lemma 6
Let $G$ be an abelian Lie group endowed with a bi-invariant metric, and for $p \in G$ let $\ell_p$ denote the left translation by $p$. The parallel transporter $\tau_{p, q}$ that transports tangent vectors at $p$ to tangent vectors at $q$ along geodesics connecting $p$ and $q$ is given by $\tau_{p, q}\, v = (d\ell_{q p^{-1}})_p\, v$ for $v \in T_p G$.
Proof
For simplicity, we abbreviate $\tau_{p, q}$ as $\tau$. Let $\mathfrak{g}$ denote the Lie algebra associated with the Lie group $G$, and $\nabla$ the Levi-Civita connection on $G$. Note that we identify elements of $\mathfrak{g}$ with left-invariant vector fields on $G$. We shall first recall that $\nabla_X Y = \frac{1}{2}[X, Y]$ for $X, Y \in \mathfrak{g}$ (see the proof of Theorem 21.3 in [29]), where $[\cdot, \cdot]$ denotes the Lie bracket of $\mathfrak{g}$. As $G$ is abelian, the Lie bracket vanishes everywhere, and hence $\nabla_X Y = 0$ if $X, Y \in \mathfrak{g}$.
Let $\gamma(t) = p\, \mathrm{exp}_G(t u)$ for $u \in \mathfrak{g}$ chosen such that $\gamma(1) = q$, where $\mathrm{exp}_G$ denotes the Lie group exponential map. Recall that, for a Lie group with a bi-invariant metric, the group exponential map coincides with the Riemannian exponential map at the group identity $e$, and left translations are isometries; thus, $\gamma$ is a geodesic. Using the fact that $t \mapsto \mathrm{exp}_G(t u)$ is the integral curve through $e$ of the left-invariant vector field $u$, according to Lemma 21.2 of [29], by the chain rule of differentials, we have

$\gamma'(t) = (d\ell_{\gamma(t)})_e\, u,$

from which we further deduce that $\gamma'$ is the restriction to $\gamma$ of the left-invariant vector field generated by $u$.
Now define a vector field