invert4geom.cross_validation

Module Contents

Functions

resample_with_test_points(data_spacing, data, region)

Take a dataframe of coordinates and label all rows that fall on the data_spacing grid as training points.

grav_cv_score(training_data, testing_data[, ...])

Find the score, represented by the root mean squared error (RMSE), between the testing gravity data and the predicted gravity data after an inversion.

grav_optimal_parameter(training_data, testing_data, ...)

Calculate the cross validation scores for a set of parameter values and return the best score and value.

constraints_cv_score(grav, constraints, **kwargs)

Find the score, represented by the root mean squared error (RMSE), between the constraint point elevations and the inverted topography at the constraint points.

resample_with_test_points(data_spacing, data, region)

Take a dataframe of coordinates and label all rows that fall on the data_spacing grid as training points. Add rows at each point that falls on the grid with half the data_spacing, and label these as testing points in the “test” column. If other data columns are present in the dataframe, they are sampled at each new location.

Parameters:
  • data_spacing (float) – full grid spacing of the training points, which is halved to place the testing points

  • data (pd.DataFrame) – dataframe with coordinate columns “easting” and “northing”; all other columns are sampled at the new grid spacing

  • region (tuple[float, float, float, float]) – region to create grid over, in the form (min_easting, max_easting, min_northing, max_northing)

Returns:

a new dataframe with an added boolean column “test” indicating whether each row is a testing or a training point.

Return type:

pd.DataFrame
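
The labelling scheme described for resample_with_test_points can be sketched in plain Python. This is an illustration only, not the invert4geom implementation: the helper name label_half_spacing_grid is hypothetical, it assumes the region extents are whole multiples of data_spacing and that the training grid starts at the region minimum, and it only generates and labels coordinates without sampling any other data columns.

```python
def label_half_spacing_grid(data_spacing, region):
    """Return grid points at half of data_spacing, flagged as test or training.

    Hypothetical helper for illustration; assumes the region extents are
    whole multiples of data_spacing.
    """
    min_e, max_e, min_n, max_n = region
    half = data_spacing / 2
    n_east = round((max_e - min_e) / half) + 1
    n_north = round((max_n - min_n) / half) + 1

    points = []
    for j in range(n_north):
        for i in range(n_east):
            # Even indices in both directions lie on the original (training) grid.
            on_original_grid = i % 2 == 0 and j % 2 == 0
            points.append(
                {
                    "easting": min_e + i * half,
                    "northing": min_n + j * half,
                    "test": not on_original_grid,
                }
            )
    return points


points = label_half_spacing_grid(data_spacing=2.0, region=(0.0, 4.0, 0.0, 4.0))
n_train = sum(not p["test"] for p in points)
n_test = sum(p["test"] for p in points)
```

With a spacing of 2 over a 4 by 4 region, the half-spacing grid has 25 nodes, of which the 9 nodes on the original grid are training points and the remaining 16 are testing points.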

grav_cv_score(training_data, testing_data, progressbar=False, plot=False, **kwargs)

Find the score, represented by the root mean squared error (RMSE), between the testing gravity data and the predicted gravity data after an inversion. Follows the methods of Uieda and Barbosa[1].

Parameters:
  • training_data (pd.DataFrame) – the rows of the dataframe that contain only the training data

  • testing_data (pd.DataFrame) – the rows of the dataframe that contain only the testing data

  • progressbar (bool, optional) – choose to show the progress bar for the forward gravity calculation, by default False

  • plot (bool, optional) – choose to plot the observed and predicted data grids, and their difference, located at the testing points, by default False

  • kwargs (Any)

Returns:

a score, represented by the root mean squared error, between the testing gravity data and the predicted gravity data.

Return type:

float

References

Uieda and Barbosa[1]
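
The RMSE score itself is simple to compute once predicted values exist at the testing points. The sketch below shows only that final scoring step with made-up numbers; the function name rmse and the sample values are assumptions for illustration, and the inversion and forward calculation that produce the predicted values in grav_cv_score are not reproduced here.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up values standing in for gravity at four testing points.
testing_gravity = [1.0, 2.0, 3.0, 4.0]
predicted_gravity = [1.0, 2.0, 3.0, 6.0]

score = rmse(testing_gravity, predicted_gravity)
```

A lower score means the inverted model predicts the held-out testing data more closely.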

grav_optimal_parameter(training_data, testing_data, param_to_test, progressbar=False, plot_grids=False, plot_cv=False, verbose=False, **kwargs)

Calculate the cross validation scores for a set of parameter values and return the best score and value.

Parameters:
  • training_data (pd.DataFrame) – just the training data rows

  • testing_data (pd.DataFrame) – just the testing data rows

  • param_to_test (tuple[str, list[float]]) – first value is a string of the parameter that is being tested, and the second value is a list of the values to test

  • progressbar (bool, optional) – display a progress bar for the number of tested values, by default False

  • plot_grids (bool, optional) – plot all the grids of observed and predicted data for each parameter value, by default False

  • plot_cv (bool, optional) – plot a graph of scores vs parameter values, by default False

  • verbose (bool, optional) – log the results, by default False

  • kwargs (Any)

Returns:

the optimal parameter value, the score associated with it, the tested parameter values, and the score for each parameter value

Return type:

tuple[float, float, list[float], list[float]]
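
The selection step can be sketched as a loop over candidate values that keeps the one with the lowest score. This is a minimal sketch under stated assumptions, not the library's code: best_parameter and toy_score are hypothetical names, toy_score is a stand-in for running an inversion and scoring it with the RMSE, and "best" is assumed to mean the lowest score.

```python
def best_parameter(values, score_func):
    """Score each candidate value and return the one with the lowest score.

    Returns (best value, best score, all values, all scores), mirroring the
    order of the return tuple described above.
    """
    scores = [score_func(value) for value in values]
    best_index = min(range(len(scores)), key=scores.__getitem__)
    return values[best_index], scores[best_index], list(values), scores


def toy_score(value):
    # Stand-in for an inversion followed by an RMSE calculation;
    # the minimum is placed at 0.1 purely for illustration.
    return (value - 0.1) ** 2 + 0.5


best_value, best_score, values, scores = best_parameter(
    [0.001, 0.01, 0.1, 1.0], toy_score
)
```

Returning every tested value alongside its score makes it easy to plot scores against parameter values and check that the minimum does not sit at the edge of the tested range.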

constraints_cv_score(grav, constraints, **kwargs)

Find the score, represented by the root mean squared error (RMSE), between the constraint point elevations and the inverted topography at the constraint points. Follows the methods of Uieda and Barbosa[1].

Parameters:
  • grav (pd.DataFrame) – gravity dataframe with columns “res” and “reg”, plus the column named by the kwarg input_grav_column

  • constraints (pd.DataFrame) – constraints dataframe with columns “easting”, “northing”, and “upward”

  • kwargs (Any)

Returns:

a score, represented by the root mean squared error, between the constraint point elevations and the inverted topography at the constraint points.

Return type:

float

References