coordinate_covariance
- sofia_redux.toolkit.resampling.coordinate_covariance(coordinates, mean=None, mask=None, dof=1)
Calculate the covariance of a distribution.
Given the sample distribution of \(N\) coordinates (\(X\)) in \(K\) dimensions, the sample covariance is given as:
\[\Sigma = E[(X - E[X])(X - E[X])^T]\]
where \(\Sigma\) is a \(K \times K\) matrix and \(E\) denotes the expected value. In the general case where the expected value of \(X\) is unknown and derived from the distribution itself, the covariance of the samples between dimension \(i\) and \(j\) is:
\[\Sigma_{ij} = \frac{1}{N - M} \sum_{k=1}^{N} {(X_{ki} - \bar{X}_i)(X_{kj} - \bar{X}_j)}\]
where \(M\) is the number of degrees of freedom lost (dof) in determining the mean (\(\bar{X}\)). If the mean is not provided, it will be calculated using coordinate_mean(), in which case the default dof of 1 is appropriate.
- Parameters:
- coordinates : numpy.ndarray (n_dimensions, n_coordinates)
The coordinates of the distribution.
- mean : numpy.ndarray (n_dimensions,), optional
The mean of the coordinate distribution in each dimension. If not provided, the expected value in each dimension will be calculated using coordinate_mean().
- mask : numpy.ndarray (n_coordinates,), optional
An array of bool values where True indicates a coordinate should be included in the calculation, and False indicates that a coordinate should be ignored. By default, all coordinates are included.
- dof : int or float, optional
The lost degrees of freedom, typically 1 to indicate that the population mean is not known and is replaced by the sample mean.
- Returns:
- covariance : numpy.ndarray of numpy.float64 (n_dimensions, n_dimensions)
The covariance of the sample distribution.
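The formula for \(\Sigma_{ij}\) can be restated as a short NumPy sketch. This is an illustration only, not the library's implementation: the function name covariance_sketch is hypothetical, the mean is assumed to be the plain arithmetic mean of the included coordinates (coordinate_mean() may differ in detail), and \(N\) is assumed to count only the coordinates selected by mask. The result is compared against numpy.cov, which uses the same (n_dimensions, n_coordinates) layout by default.

```python
import numpy as np


def covariance_sketch(coordinates, mean=None, mask=None, dof=1):
    """Hypothetical NumPy restatement of the covariance formula.

    Not the sofia_redux implementation; assumes an arithmetic mean and
    that N counts only the coordinates kept by ``mask``.
    """
    coordinates = np.asarray(coordinates, dtype=float)
    if mask is not None:
        # Keep only the coordinates flagged True.
        coordinates = coordinates[:, np.asarray(mask, dtype=bool)]
    n = coordinates.shape[1]
    if mean is None:
        # Sample mean per dimension; this costs one degree of freedom.
        mean = coordinates.mean(axis=1)
    # Deviations from the mean, shape (n_dimensions, n_coordinates).
    deviations = coordinates - np.asarray(mean, dtype=float)[:, None]
    # Sum of outer products over coordinates, normalized by N - M.
    return deviations @ deviations.T / (n - dof)


rng = np.random.default_rng(0)
coords = rng.normal(size=(2, 50))   # 2 dimensions, 50 coordinates
mask = np.ones(50, dtype=bool)
mask[:5] = False                    # ignore the first five coordinates

cov = covariance_sketch(coords, mask=mask)
reference = np.cov(coords[:, mask], ddof=1)
```

With the mean derived from the samples and dof=1, cov agrees with reference to floating-point precision. If the population mean were known in advance and passed as mean, dof=0 would be the matching choice under the same formula, since no degree of freedom is spent estimating it.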