coordinate_covariance

sofia_redux.toolkit.resampling.coordinate_covariance(coordinates, mean=None, mask=None, dof=1)

Calculate the covariance of a distribution.

Given a sample distribution of \(N\) coordinates (\(X\)) in \(K\) dimensions, the covariance is defined as:

\[\Sigma = E[(X - E[X])(X - E[X])^T]\]

where \(\Sigma\) is a \(K \times K\) matrix and \(E\) denotes the expected value. In the general case, where the expected value of \(X\) is unknown and derived from the distribution itself, the covariance of the samples between dimensions \(i\) and \(j\) is:

\[\Sigma_{ij} = \frac{1}{N - M} \sum_{k=1}^{N} {(X_{ki} - \bar{X}_i)(X_{kj} - \bar{X}_j)}\]

where \(M\) is the number of degrees of freedom lost (dof) in determining the mean (\(\bar{X}\)). If the mean is not provided, it will be calculated using coordinate_mean(), in which case the default dof of 1 is appropriate.
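As an illustration of the formula above, the following is a minimal NumPy sketch, not the library's own implementation. The function name `covariance_sketch` is hypothetical, and the sketch assumes the (n_dimensions, n_coordinates) layout described in the parameters, with the mean always taken from the samples themselves.

```python
import numpy as np


def covariance_sketch(coordinates, dof=1):
    """Illustrative covariance of (n_dimensions, n_coordinates) samples.

    Hypothetical helper for this example only; it is not part of
    sofia_redux.
    """
    coordinates = np.asarray(coordinates, dtype=float)
    n = coordinates.shape[1]
    # Sample mean in each dimension, kept as a column for broadcasting.
    mean = coordinates.mean(axis=1, keepdims=True)
    deviations = coordinates - mean
    # Sum over samples k of (X_ki - mean_i)(X_kj - mean_j), then
    # normalize by N - M, where M is the lost degrees of freedom.
    return deviations @ deviations.T / (n - dof)


rng = np.random.default_rng(0)
samples = rng.normal(size=(2, 500))  # K = 2 dimensions, N = 500 coordinates
covariance = covariance_sketch(samples)

# numpy.cov treats rows as variables, which matches this layout.
reference = np.cov(samples, ddof=1)
```

With dof=1 the result should agree with `numpy.cov(samples, ddof=1)`, since both divide the summed products of deviations by \(N - 1\).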

Parameters:
coordinates : numpy.ndarray (n_dimensions, n_coordinates)

The coordinates of the distribution.

mean : numpy.ndarray (n_dimensions,), optional

The mean of the coordinate distribution in each dimension. If not provided, the expected value in each dimension will be calculated using coordinate_mean().

mask : numpy.ndarray (n_coordinates,), optional

An array of bool values where True indicates a coordinate should be included in the calculation, and False indicates that a coordinate should be ignored. By default, all coordinates are included.

dof : int or float, optional

The lost degrees of freedom, typically 1 to indicate that the population mean is not known and is replaced by the sample mean.

Returns:
covariance : numpy.ndarray of numpy.float64 (n_dimensions, n_dimensions)

The covariance of the sample distribution.
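The sketch below shows one way the mean, mask, and dof parameters could interact. It is plain NumPy rather than a call into sofia_redux, and the name `masked_covariance_sketch` is hypothetical. It assumes that masked-out coordinates are dropped before the mean is computed and that \(N\) counts only the retained coordinates; the library's actual handling may differ.

```python
import numpy as np


def masked_covariance_sketch(coordinates, mean=None, mask=None, dof=1):
    """Illustrative stand-in mirroring the documented parameters.

    Hypothetical helper for this example only; it is not part of
    sofia_redux.
    """
    coordinates = np.asarray(coordinates, dtype=float)
    if mask is not None:
        # Keep only the coordinates flagged True (assumed behavior).
        coordinates = coordinates[:, np.asarray(mask, dtype=bool)]
    if mean is None:
        # Fall back to the sample mean of the retained coordinates.
        mean = coordinates.mean(axis=1)
    deviations = coordinates - np.asarray(mean, dtype=float)[:, None]
    return deviations @ deviations.T / (coordinates.shape[1] - dof)


coordinates = np.array([[0.0, 1.0, 2.0, 100.0],
                        [0.0, 2.0, 4.0, -50.0]])
mask = np.array([True, True, True, False])  # ignore the last coordinate

# Mean estimated from the three retained samples, so dof=1.
result = masked_covariance_sketch(coordinates, mask=mask)
# result is [[1, 2], [2, 4]]

# Mean supplied by the caller; no degree of freedom is lost here, so dof=0.
known = masked_covariance_sketch(
    coordinates, mean=np.array([1.0, 2.0]), mask=mask, dof=0)
```

In the second call the divisor is \(N\) rather than \(N - 1\), which is the usual choice when the mean is known independently of the samples.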