adler.utilities.science_utilities
Functions
outlier_diff | Test whether new data point(s) is an outlier compared to the model by considering the residual difference. |
outlier_std | Test whether new data point(s) is an outlier compared to the model by considering the standard deviation of the residuals. |
zero_func | Dummy function to return a zero. |
sigma_clip | Wrapper function for astropy.stats.sigma_clip. Here we define the default centre of the data (the data - model residuals) to be zero. |
outlier_sigma_diff | Function to identify outliers by comparing the uncertainty of measurements to their residuals. |
apparition_gap_finder | Function to find gaps in a data series. E.g. given an array of observation times, find the different apparitions of an asteroid from gaps between observations larger than a given value. |
get_df_obs_filt | Retrieve a dataframe of observations in a given filter. Has the option to limit the observations to a range of values, e.g. times/phase angles, if required. |
| # TODO docstring |
| # TODO docstring |
running_stats | Function to calculate the running mean and std statistics from the number, sum and sum of the squares of a dataset. |
| Wrapper function to execute a terminal command using the Python subprocess module. |
Module Contents
- outlier_diff(new_res, diff_cut=1.0)[source]
Test whether new data point(s) is an outlier compared to the model by considering the residual difference.
- Parameters:
new_res (array) – The residuals of the new data points compared to the model
diff_cut (float) – The threshold difference value for outlier detection.
- Returns:
outlier_flag – Array of flags indicating whether each data point is an outlier (True)
- Return type:
array
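A minimal sketch of how a threshold test of this kind could look, assuming an outlier is any point whose absolute residual reaches `diff_cut`. The helper name `outlier_diff_sketch` and the inclusive `>=` comparison are assumptions made for illustration; they are not taken from the Adler source.

```python
import numpy as np


def outlier_diff_sketch(new_res, diff_cut=1.0):
    """Hypothetical stand-in: flag residuals whose magnitude reaches diff_cut."""
    new_res = np.atleast_1d(np.asarray(new_res, dtype=float))
    # Assumption: the test uses the absolute residual and is inclusive.
    return np.abs(new_res) >= diff_cut


# Residuals (data - model) for four new points.
flags = outlier_diff_sketch([0.1, -1.5, 0.9, 2.0], diff_cut=1.0)
print(flags)
```

With these inputs the second and fourth points are flagged, since only their residuals are at least 1.0 in magnitude.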
- outlier_std(new_res, data_res, std_cut=3.0)[source]
Test whether new data point(s) is an outlier compared to the model by considering the standard deviation of the residuals.
- Parameters:
new_res (array) – The residuals of the new data point(s) compared to the model
data_res (array) – The residuals of the data compared to the model.
std_cut (float) – The threshold standard deviation for outlier detection.
- Returns:
outlier_flag – Array of flags indicating whether each data point is an outlier (True)
- Return type:
array
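The same idea can be sketched with the threshold scaled by the scatter of the existing residuals. This assumes the scatter is the plain `np.std` of `data_res` and that the comparison is on the absolute residual; both choices, and the name `outlier_std_sketch`, are illustrative assumptions rather than the Adler implementation.

```python
import numpy as np


def outlier_std_sketch(new_res, data_res, std_cut=3.0):
    """Hypothetical stand-in: flag new residuals beyond std_cut standard deviations."""
    new_res = np.atleast_1d(np.asarray(new_res, dtype=float))
    # Assumption: scatter is the standard deviation of the existing residuals.
    data_std = np.std(np.asarray(data_res, dtype=float))
    return np.abs(new_res) >= std_cut * data_std


data_res = np.array([-0.1, 0.1, -0.1, 0.1])  # existing residuals, std = 0.1
flags = outlier_std_sketch([0.05, 0.5], data_res, std_cut=3.0)
print(flags)
```

Here the threshold is 3 x 0.1 = 0.3, so only the 0.5 residual is flagged.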
- zero_func(x, axis=None)[source]
Dummy function to return a zero. Can be used as the centre function in astropy.stats.sigma_clip to get std relative to zero rather than median/mean value.
- Parameters:
x – Dummy variable
axis – required to match the syntax of numpy functions such as np.median
- sigma_clip(data_res, **kwargs)[source]
Wrapper function for astropy.stats.sigma_clip. Here we define the default centre of the data (the data - model residuals) to be zero.
- Parameters:
data_res (array) – The residuals of the data compared to the model.
kwargs (dict) – Dictionary of keyword arguments from astropy.stats.sigma_clip, namely: sigma (default 3), maxiters (default 5), cenfunc (default ‘median’)
- Returns:
sig_clip_mask – returns only the mask from astropy.stats.sigma_clip
- Return type:
array
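To illustrate what clipping about zero means, here is a numpy-only sketch. It is not the wrapper itself and not astropy's algorithm: the function name `zero_centred_clip_sketch`, the use of `np.std` for the scatter, and the stopping rule are all assumptions chosen to keep the example self-contained.

```python
import numpy as np


def zero_centred_clip_sketch(data_res, sigma=3.0, maxiters=5):
    """Hypothetical stand-in: iteratively mask residuals far from zero."""
    data_res = np.asarray(data_res, dtype=float)
    mask = np.zeros(data_res.shape, dtype=bool)  # True marks a clipped point
    for _ in range(maxiters):
        # Scatter of the points that survive so far; the centre stays fixed at zero.
        std = np.std(data_res[~mask])
        # Once clipped, a point stays clipped.
        new_mask = mask | (np.abs(data_res) > sigma * std)
        if np.array_equal(new_mask, mask):
            break  # converged: the clipped set did not change
        mask = new_mask
    return mask


data_res = np.array([0.1, -0.2, 0.15, -0.1, 0.05, 5.0])
mask = zero_centred_clip_sketch(data_res, sigma=2.0)
print(mask)
```

Because the centre is held at zero rather than at the median, a set of residuals that is systematically offset from the model is judged against the model itself, not against its own average.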
- outlier_sigma_diff(data_res, data_sigma, std_sigma=1)[source]
Function to identify outliers by comparing the uncertainty of measurements to their residuals
- Parameters:
data_res (array) – The residuals of the data compared to the model.
data_sigma (array) – The uncertainties of the data points
std_sigma (float) – Number of standard deviations to identify outliers, assuming the data uncertainties represent one standard deviation
- Returns:
outlier_flag – Array of flags indicating whether each data point is an outlier (True)
- Return type:
array
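A minimal sketch of a per-point test of this kind, assuming a point is an outlier when its absolute residual exceeds `std_sigma` times its own uncertainty. The name `outlier_sigma_diff_sketch` and the strict `>` comparison are assumptions for illustration.

```python
import numpy as np


def outlier_sigma_diff_sketch(data_res, data_sigma, std_sigma=1):
    """Hypothetical stand-in: compare each residual with its own uncertainty."""
    data_res = np.asarray(data_res, dtype=float)
    data_sigma = np.asarray(data_sigma, dtype=float)
    # Assumption: each uncertainty is one standard deviation for that point.
    return np.abs(data_res) > std_sigma * data_sigma


flags = outlier_sigma_diff_sketch(
    data_res=[0.05, -0.3, 0.2],
    data_sigma=[0.1, 0.1, 0.25],
    std_sigma=2,
)
print(flags)
```

Unlike a single global cut, the threshold here differs per point: the 0.2 residual is not flagged because its uncertainty (0.25) is large, while the -0.3 residual with a 0.1 uncertainty is.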
- apparition_gap_finder(x, dx=100.0)[source]
Function to find gaps in a data series. E.g. given an array of observation times, find the different apparitions of an asteroid from gaps between observations larger than a given value.
- Parameters:
x (array) – The SORTED data array to search for gaps
dx (float) – The size of gap to identify in data series
- Returns:
x_gaps – Values of x which define the groups in the data, where each group is x_gaps[i] <= x < x_gaps[i+1]
- Return type:
array
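The gap search can be sketched with `np.diff` on the sorted array. The name `gap_edges_sketch` is hypothetical, and closing the final group at `x[-1] + dx` is an assumption made so that every value satisfies `x_gaps[i] <= x < x_gaps[i+1]`; the Adler function may choose its edges differently.

```python
import numpy as np


def gap_edges_sketch(x, dx=100.0):
    """Hypothetical stand-in: return group edges for a SORTED array x."""
    x = np.asarray(x, dtype=float)
    # Indices i where the step from x[i] to x[i + 1] is larger than dx.
    breaks = np.where(np.diff(x) > dx)[0]
    # Each group starts at x[0] or at the first value after a gap.
    starts = np.concatenate(([x[0]], x[breaks + 1]))
    # Assumption: close the last group beyond the final value.
    return np.append(starts, x[-1] + dx)


times = np.array([0.0, 10.0, 20.0, 300.0, 310.0, 700.0])  # sorted observation times
edges = gap_edges_sketch(times, dx=100.0)
# Label each observation with the index of the group it falls in.
groups = np.digitize(times, edges) - 1
print(edges, groups)
```

With a 100-unit gap threshold the six times split into three groups, which is how apparitions would be separated in a series of observation dates.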
- get_df_obs_filt(planetoid, filt, x_col='midPointMjdTai', x1=None, x2=None, col_list=None, pc_model=None)[source]
Retrieve a dataframe of observations in a given filter. Has the option to limit the observations to a range of values, e.g. times/phase angles, if required.
- Parameters:
planetoid (object) – Adler planetoid object containing the observations
filt (str) – The filter to query
x_col (str) – Column name to use for ordering values and limiting observations to a range: x1 <= df_obs[x_col] <= x2
x1 (float) – Lower limit value for x_col
x2 (float) – Upper limit value for x_col
col_list (list) – List of column names to retrieve in addition to x_col, otherwise all columns are retrieved. N.B. if AbsMag is included in col_list then the absolute magnitude is calculated for the phase curve model pc_model
pc_model (object) – Adler PhaseCurve model used to calculate AbsMag if required
- Returns:
df_obs – DataFrame of observations in the requested filter, ordered by x_col and with any x_col limits applied
- Return type:
DataFrame
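The ordering and range-limiting described above can be sketched on a plain pandas DataFrame. This sketch does not use an Adler planetoid object and does not compute AbsMag; the function name `select_obs_sketch`, the column names other than `midPointMjdTai`, and the index reset are assumptions for illustration only.

```python
import pandas as pd


def select_obs_sketch(df_obs, x_col="midPointMjdTai", x1=None, x2=None, col_list=None):
    """Hypothetical stand-in: order by x_col, apply x1 <= x_col <= x2, pick columns."""
    df = df_obs.sort_values(x_col)
    if x1 is not None:
        df = df[df[x_col] >= x1]
    if x2 is not None:
        df = df[df[x_col] <= x2]
    if col_list is not None:
        # Always keep x_col, followed by the requested columns.
        df = df[[x_col] + [c for c in col_list if c != x_col]]
    return df.reset_index(drop=True)


# Made-up observations; "reduced_mag" and "phaseAngle" are illustrative column names.
df_obs = pd.DataFrame(
    {
        "midPointMjdTai": [60030.0, 60010.0, 60020.0, 60040.0],
        "reduced_mag": [17.2, 17.0, 17.1, 17.3],
        "phaseAngle": [12.0, 5.0, 8.0, 15.0],
    }
)
subset = select_obs_sketch(df_obs, x1=60015.0, x2=60035.0, col_list=["reduced_mag"])
print(subset)
```

Both limits are inclusive, matching the `x1 <= df_obs[x_col] <= x2` condition in the parameter description, and leaving `x1` or `x2` as `None` skips that side of the cut.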
- running_stats(N, sum_x, sum_x2)[source]
Function to calculate the running mean and std statistics from the number, sum and sum of the squares of a dataset. This allows us to calculate stats for a dataset without recording every value; only the three summary values are needed. See https://en.wikipedia.org/wiki/Standard_deviation#Rapid_calculation_methods
- Parameters:
N (int) – Number of data points, x
sum_x (float) – Sum of all data points, x
sum_x2 (float) – Sum of the square of each data point, x**2
- Returns:
mean_x, std_x – The mean and std of the data, x
- Return type:
float, float
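A minimal sketch of the calculation from the three summary values, assuming the population form of the standard deviation (variance = mean of squares minus square of mean). Whether Adler uses the population or the sample (N - 1) form is not stated here, so that choice, like the name `running_stats_sketch`, is an assumption.

```python
import math


def running_stats_sketch(N, sum_x, sum_x2):
    """Hypothetical stand-in: mean and population std from N, sum(x) and sum(x**2)."""
    mean_x = sum_x / N
    # Population variance: E[x^2] - (E[x])^2; max() guards against tiny negative round-off.
    var_x = max(sum_x2 / N - mean_x**2, 0.0)
    return mean_x, math.sqrt(var_x)


# Accumulate only the three summary values while streaming through the data.
N, sum_x, sum_x2 = 0, 0.0, 0.0
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    N += 1
    sum_x += value
    sum_x2 += value**2

mean_x, std_x = running_stats_sketch(N, sum_x, sum_x2)
print(mean_x, std_x)
```

For these eight values the sums are 40 and 232, giving a mean of 5.0 and a standard deviation of 2.0. The subtraction can lose precision when the mean is large compared with the scatter, which is why the sketch clamps the variance at zero.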