adler.utilities.science_utilities

Functions

outlier_diff(new_res[, diff_cut])

Test whether new data point(s) are outliers compared to the model by considering the residual difference.

outlier_std(new_res, data_res[, std_cut])

Test whether new data point(s) are outliers compared to the model by considering the standard deviation of the residuals.

zero_func(x[, axis])

Dummy function to return a zero.

sigma_clip(data_res, **kwargs)

Wrapper function for astropy.stats.sigma_clip. Here the default centre of the data (the data - model residuals) is defined to be zero.

outlier_sigma_diff(data_res, data_sigma[, std_sigma])

Function to identify outliers by comparing the uncertainty of measurements to their residuals

apparition_gap_finder(x[, dx])

Function to find gaps in a data series. E.g. given an array of observation times, find the different apparitions of an asteroid from gaps between observations larger than a given value.

get_df_obs_filt(planetoid, filt[, x_col, x1, x2, ...])

Retrieve a dataframe of observations in a given filter. Has the option to limit the observations to a range of values, e.g. times/phase angles, if required.

large_magErr_mask(df_obs[, magErr_percentile_cut])

# TODO docstring

split_obs(df_obs, process_mjd[, n_new_nights])

# TODO docstring

running_stats(N, sum_x, sum_x2)

Function to calculate the running mean and std statistics from the number, sum and sum of the squares of a dataset.

execute_subprocess(cmd)

Wrapper function to execute a terminal command using the Python subprocess module.

Module Contents

outlier_diff(new_res, diff_cut=1.0)[source]

Test whether new data point(s) are outliers compared to the model by considering the residual difference.

Parameters:
  • new_res (array) – The residuals of the new data points compared to the model

  • diff_cut (float) – The threshold difference value for outlier detection.

Returns:

outlier_flag – Array of flags indicating whether each data point is an outlier (True)

Return type:

array
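The thresholding described above can be illustrated with a minimal NumPy sketch. It assumes the test is applied to the absolute residual and that a residual equal to diff_cut counts as an outlier; the name outlier_diff_sketch is hypothetical and this is not the adler implementation.

```python
import numpy as np


def outlier_diff_sketch(new_res, diff_cut=1.0):
    """Flag residuals whose absolute value reaches diff_cut (assumed rule)."""
    # Accept scalars, lists or arrays and always work with a 1-D array
    new_res = np.atleast_1d(np.asarray(new_res, dtype=float))
    return np.abs(new_res) >= diff_cut


flags = outlier_diff_sketch([0.1, -1.5, 0.9], diff_cut=1.0)
print(flags)  # [False  True False]
```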

outlier_std(new_res, data_res, std_cut=3.0)[source]

Test whether new data point(s) are outliers compared to the model by considering the standard deviation of the residuals.

Parameters:
  • new_res (array) – The residuals of the new data point(s) compared to the model

  • data_res (array) – The residuals of the data compared to the model.

  • std_cut (float) – The threshold standard deviation for outlier detection.

Returns:

outlier_flag – Array of flags indicating whether each data point is an outlier (True)

Return type:

array
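As a rough illustration of this test, the sketch below flags a new residual when its absolute value reaches std_cut times the standard deviation of the existing residuals. The use of np.std (population standard deviation) and the inclusive comparison are assumptions, and outlier_std_sketch is a hypothetical name rather than the adler function.

```python
import numpy as np


def outlier_std_sketch(new_res, data_res, std_cut=3.0):
    """Flag new residuals lying std_cut standard deviations or more from zero."""
    new_res = np.atleast_1d(np.asarray(new_res, dtype=float))
    # Scatter of the existing residuals about their mean (assumed definition)
    data_std = np.std(np.asarray(data_res, dtype=float))
    return np.abs(new_res) >= std_cut * data_std


data_res = np.array([-0.1, 0.1, -0.1, 0.1])  # standard deviation of about 0.1
flags = outlier_std_sketch([0.05, 0.5], data_res, std_cut=3.0)
print(flags)  # [False  True]
```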

zero_func(x, axis=None)[source]

Dummy function to return a zero. Can be used as the centre function in astropy.stats.sigma_clip to get std relative to zero rather than median/mean value.

Parameters:
  • x – Dummy variable

  • axis – Required to match the syntax of numpy functions such as np.median

sigma_clip(data_res, **kwargs)[source]

Wrapper function for astropy.stats.sigma_clip. Here the default centre of the data (the data - model residuals) is defined to be zero.

Parameters:
  • data_res (array) – The residuals of the data compared to the model.

  • kwargs (dict) – Dictionary of keyword arguments from astropy.stats.sigma_clip, namely: sigma (default 3), maxiters (default 5), cenfunc (default ‘median’)

Returns:

sig_clip_mask – returns only the mask from astropy.stats.sigma_clip

Return type:

array
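To show what clipping about a centre of zero means in practice, here is a NumPy-only sketch. It does not call astropy.stats.sigma_clip; it is a simplified stand-in that assumes the clipping width is sigma times the standard deviation of the currently unmasked residuals. The names zero_centre and clip_about_zero_sketch are hypothetical.

```python
import numpy as np


def zero_centre(x, axis=None):
    """Centre function that ignores its input and returns zero."""
    return 0.0


def clip_about_zero_sketch(data_res, sigma=3.0, maxiters=5):
    """Iteratively mask residuals further than sigma * std from zero."""
    data_res = np.asarray(data_res, dtype=float)
    mask = np.zeros(data_res.shape, dtype=bool)
    for _ in range(maxiters):
        kept = data_res[~mask]
        centre = zero_centre(kept)
        width = sigma * np.std(kept)
        # Once a point is masked it stays masked
        new_mask = mask | (np.abs(data_res - centre) > width)
        if np.array_equal(new_mask, mask):
            break  # converged: no further points were clipped
        mask = new_mask
    return mask


res = [0.1, -0.2, 0.15, -0.1, 0.05, -0.15, 0.2, -0.05, 0.1, 5.0]
mask = clip_about_zero_sketch(res, sigma=3.0)
print(mask)  # only the final residual (5.0) is masked
```

Measuring distance from zero rather than from the median means a systematic offset between data and model counts towards the clipping, instead of being absorbed into the centre estimate.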

outlier_sigma_diff(data_res, data_sigma, std_sigma=1)[source]

Function to identify outliers by comparing the residuals of measurements to their uncertainties.

Parameters:
  • data_res (array) – The residuals of the data compared to the model.

  • data_sigma (array) – The uncertainties of the data points

  • std_sigma (float) – Number of standard deviations to identify outliers, assuming the data uncertainties represent one standard deviation

Returns:

outlier_flag – Array of flags indicating whether each data point is an outlier (True)

Return type:

array
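The comparison can be sketched as below, assuming a point is flagged when its absolute residual exceeds std_sigma times its own uncertainty. The strict inequality and the name outlier_sigma_diff_sketch are assumptions made for illustration only.

```python
import numpy as np


def outlier_sigma_diff_sketch(data_res, data_sigma, std_sigma=1):
    """Flag points whose |residual| exceeds std_sigma times their uncertainty."""
    data_res = np.asarray(data_res, dtype=float)
    data_sigma = np.asarray(data_sigma, dtype=float)
    return np.abs(data_res) > std_sigma * data_sigma


residuals = [0.05, -0.3, 0.2]
uncertainties = [0.1, 0.1, 0.25]
flags = outlier_sigma_diff_sketch(residuals, uncertainties, std_sigma=2)
print(flags)  # [False  True False]
```

Unlike a single global threshold, this per-point test tolerates larger residuals for measurements with larger uncertainties.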

apparition_gap_finder(x, dx=100.0)[source]

Function to find gaps in a data series. E.g. given an array of observation times, find the different apparitions of an asteroid from gaps between observations larger than a given value.

Parameters:
  • x (array) – The SORTED data array to search for gaps

  • dx (float) – The size of gap to identify in data series

Returns:

x_gaps – Values of x which define the groups in the data, where each group is x_gaps[i] <= x < x_gaps[i+1]

Return type:

array
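One way to picture the gap search is the sketch below, which assumes the returned edges are the first value, the first value after each gap larger than dx, and the last value. The real function may define the edges differently (in particular the final one); gap_finder_sketch is a hypothetical name.

```python
import numpy as np


def gap_finder_sketch(x, dx=100.0):
    """Return group edges for a SORTED array, splitting where steps exceed dx."""
    x = np.asarray(x, dtype=float)
    # Values that immediately follow a gap larger than dx start a new group
    group_starts = x[1:][np.diff(x) > dx]
    return np.concatenate(([x[0]], group_starts, [x[-1]]))


# Hypothetical observation times (MJD) forming two apparitions
mjd = [60000, 60010, 60020, 60300, 60310, 60320]
edges = gap_finder_sketch(mjd, dx=100.0)
print(edges)  # [60000. 60300. 60320.]
```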

get_df_obs_filt(planetoid, filt, x_col='midPointMjdTai', x1=None, x2=None, col_list=None, pc_model=None)[source]

Retrieve a dataframe of observations in a given filter. Has the option to limit the observations to a range of values, e.g. times/phase angles, if required.

Parameters:
  • planetoid (object) – Adler planetoid object containing the observations

  • filt (str) – The filter to query

  • x_col (str) – Column name to use for ordering values and limiting observations to a range: x1 <= df_obs[x_col] <= x2

  • x1 (float) – Lower limit value for x_col

  • x2 (float) – Upper limit value for x_col

  • col_list (list) – List of column names to retrieve in addition to x_col, otherwise all columns are retrieved. N.B. if AbsMag is included in col_list then the absolute magnitude is calculated for the phase curve model pc_model

  • pc_model (object) – Adler PhaseCurve model used to calculate AbsMag if required

Returns:

df_obs – DataFrame of observations in the requested filter, ordered by x_col and with any x_col limits applied

Return type:

DataFrame
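The ordering and range selection on x_col can be illustrated with pandas alone. This sketch does not touch an Adler planetoid object; the DataFrame contents, the mag column and the helper name limit_range_sketch are all hypothetical, and only the documented condition x1 <= df_obs[x_col] <= x2 is taken from the description above.

```python
import pandas as pd


def limit_range_sketch(df_obs, x_col="midPointMjdTai", x1=None, x2=None):
    """Order observations by x_col and keep rows with x1 <= x_col <= x2."""
    df = df_obs.sort_values(x_col)
    if x1 is not None:
        df = df[df[x_col] >= x1]
    if x2 is not None:
        df = df[df[x_col] <= x2]
    return df.reset_index(drop=True)


# Hypothetical observations, deliberately out of time order
df_obs = pd.DataFrame(
    {
        "midPointMjdTai": [60003.0, 60001.0, 60002.0, 60004.0],
        "mag": [18.3, 18.1, 18.2, 18.4],
    }
)
df_sel = limit_range_sketch(df_obs, x1=60002.0, x2=60003.0)
print(df_sel["midPointMjdTai"].tolist())  # [60002.0, 60003.0]
```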

large_magErr_mask(df_obs, magErr_percentile_cut=95)[source]

# TODO docstring

split_obs(df_obs, process_mjd, n_new_nights=3)[source]

# TODO docstring

running_stats(N, sum_x, sum_x2)[source]

Function to calculate the running mean and std statistics from the number, sum and sum of the squares of a dataset. This allows statistics to be calculated for a dataset where not every value is recorded, only the three summary values. See https://en.wikipedia.org/wiki/Standard_deviation#Rapid_calculation_methods

Parameters:
  • N (int) – Number of data points, x

  • sum_x (float) – Sum of all data points, x

  • sum_x2 (float) – Sum of the square of each data point, x**2

Returns:

mean_x, std_x – The mean and std of the data, x

Return type:

float, float
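The calculation from the three summary values can be sketched as follows. This assumes the population form of the standard deviation, std = sqrt(sum_x2 / N - mean**2); whether adler uses the population or sample form is not stated here, and running_stats_sketch is a hypothetical name.

```python
import math


def running_stats_sketch(N, sum_x, sum_x2):
    """Mean and population std from the count, sum and sum of squares."""
    mean_x = sum_x / N
    var_x = sum_x2 / N - mean_x**2
    # Guard against tiny negative values caused by floating point rounding
    return mean_x, math.sqrt(max(var_x, 0.0))


# Accumulate the three summary values one point at a time
N, sum_x, sum_x2 = 0, 0.0, 0.0
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    N += 1
    sum_x += value
    sum_x2 += value**2

mean_x, std_x = running_stats_sketch(N, sum_x, sum_x2)
print(mean_x, std_x)  # 5.0 2.0
```

Only three numbers need to be stored between updates, at the cost of some numerical precision when the mean is large compared to the scatter.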

execute_subprocess(cmd)[source]

Wrapper function to execute a terminal command using the Python subprocess module.

Parameters:

cmd (str) – Command to be executed

Returns:

out, err – The output and any error messages returned by subprocess

Return type:

str, str
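A wrapper of this kind might look like the sketch below, which uses subprocess.run to capture standard output and standard error as strings. Splitting the command with shlex instead of running it through a shell is a choice made for this illustration, not necessarily what adler does, and execute_subprocess_sketch is a hypothetical name.

```python
import shlex
import subprocess
import sys


def execute_subprocess_sketch(cmd):
    """Run a command string and return its captured stdout and stderr."""
    result = subprocess.run(shlex.split(cmd), capture_output=True, text=True)
    return result.stdout, result.stderr


# Use the current Python interpreter so the example does not depend on
# any particular program being installed
cmd = f"{shlex.quote(sys.executable)} -c 'print(42)'"
out, err = execute_subprocess_sketch(cmd)
print(out.strip())  # 42
```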