savitzky_golay¶
- sofia_redux.toolkit.convolve.kernel.savitzky_golay(*args, order=2, error=1, mask=None, do_error=False, robust=0, eps=0.01, maxiter=100, model=False, scale=False, ignorenans=True, **kwargs)[source]¶
Apply a least-squares (Savitzky-Golay) polynomial filter
Convenience wrapper for toolkit.convolve.kernel.SavgolConvolve
Any NaN or False “mask” values will be interpolated over before filtering. Error will be propagated correctly and will account for interpolation over masked/NaN values or outliers removed during “robust” iteration.
Arguments are supplied in the following order: [x0, x1, … xn], values, window, [optional parameters]
where x are dimension coordinates supplied in the same order as the axes of values. Therefore, if you supplied arguments in the order x, y, z then you should expect the zeroth axis of z to relate to the x-coordinates and the first axis of z to relate to the y-coordinates. This is contrary to standard numpy (row (y), col (x)) ordering. For consistency with numpy, one would supply the arguments in the y, x, z order.
The x0, x1,… independent values do not need to be supplied, and instead, the features of values will be used to generate a regular grid of coordinates. Be aware however that the independent variables are used to rescale window widths if “scale” is set to True, and also used for linear interpolation over masked, NaN, or identified outliers (if “robust”).
- Parameters:
- args(ndim + 2)-tuple or 2-tuple
args[-1] contains the filtering window width. This should be a positive integer greater than order and less than the size of the image.
args[-2] contains the dependent variables as an array of shape (shape). If independent variables were supplied, (shape) must match across args[:ndim + 1]. Otherwise, the dimension and shape is determined by args[-2].
args[:ndim] contain the independent variables. These are used for interpolation purposes or “scale” and should be of size (shape). However, they do not need to be specified if interpolation is being carried out on a regular grid. In this case, the user should just supply the dependent variables as an array of the correct dimension and shape.
- orderint, optional
Order of polynomial to use in designing the filter. Higher orders produce sharper features.
- errorfloat or array_like of float (shape), optional
Dependent variable errors to propagate
- maskarray_like or bool (shape), optional
Mask indicating good (True) and bad (False) samples. Bad values will not be included in the fit and will be replaced with linear interpolation prior to filtering.
- do_errorbool, optional
If True, return the propagated error in addition to the filtered data.
- robustint or float, optional
If > 0, taken as the sigma threshold above which to identify outliers. Outliers are those identified as abs(x_i - x_med) / MAD > “robust” where x is the residual of (data - fit) x_med is the median, MAD is the median absolute deviation defined as 1.482 * median(abs(x_i - x_med)). The fit will be iterated upon until the set of identified outliers does not change, or any change in the relative RMS is less than “eps”, or “maxiter” iterations have occured.
- epsfloat, optional
The limit of fractional check in RMS if “robust” is > 0
- maxiterint, optional
The maximum number iterations if “robust” is > 0
- modelbool, optional
If set, return the fitting model with lots of nice little methods and statistics to play with.
- scalebool, optional
If True, scale “window” to the average spacing between samples over each dimension. Note that this replaces “width” in the old IDL version. This option should not be used if working in multiple non-orthogonal features, as average spacing per dimension is taken as the average separation between ordered dimensional coordinates.
- ignorenansbool, optional
If True (default and highly recommended), NaNs will be removed and interpolated over.
- kwargsdict, optional.
Optional keywords and values that would be passed into scipy.signal.savgol_filter.
- Returns:
- filtered_datanumpy.ndarray of numpy.float64 (shape) or Model
The filtered data or an instance of toolkit.base.Model.
Notes
Edge conditions are handled differently according to whether the data is one dimensional or not. If it is, then extrapolation is permitted. However, extrapolation is not permitted in features of > 2. In these cases, everything should work as expected unless the corner samples are NaN. Interpolation is not possible and NaNs will be applied to the filtered array according to width/2 of the window centered on the corner NaN.
Also note, the IDL version “robustsg” was only 1-dimensional and savitzky_golay will perform exactly the same in this case.
Examples
Generate coefficients for cubic-polynomial with a window of 5
>>> import numpy as np >>> x = np.arange(5) >>> y = [0, 0, 1, 0, 0] >>> filtered = savitzky_golay(x, y, 5, order=2) >>> assert np.allclose(filtered, [-3/35, 12/35, 17/35, 12/35, -3/35])
2-dimensional example
>>> (y, x), z = np.mgrid[:11, :11], np.zeros((11, 11)) >>> z[5, 5] = 1 >>> x = x / 2
Coordinates are inferred if not supplied
>>> filtered1 = savitzky_golay(x, y, z, 5, order=2) >>> filtered2 = savitzky_golay(z, 5, order=2) >>> assert np.allclose(filtered1, filtered2) >>> filtered3 = savitzky_golay(x, y, z, 5, order=2, scale=True) >>> print(np.sum(filtered1 != 0)) # window = (5, 5) x, y pixels 25 >>> print(np.sum(filtered3 != 0)) # window = (11, 5) x, y, pixels 55
Note that in the last case, scale scaled the x window width to 11 pixels since the average spacing in x is 0.5. i.e. a window width of 1 is equal to 2 x-pixels. Therefore, the window is expanded to 2 * 5 = 10, and then an extra pixel is added on to ensure that the filter is centered (odd number of pixels).