invert4geom.optimize_regional_constraint_point_minimization
- optimize_regional_constraint_point_minimization(testing_training_df, grid_method, grav_ds, n_trials, tension_factor_limits=(0, 1), spline_damping_limits=None, depth_limits=None, block_size_limits=None, damping_limits=None, grav_obs_height_limits=None, sampler=None, plot=False, plot_grid=False, fold_progressbar=False, optimize_on_true_regional_misfit=False, separate_metrics=True, score_as_median=False, true_regional=None, progressbar=True, parallel=False, fname=None, seed=0, **kwargs)
Run an Optuna optimization to find the optimal hyperparameters for the Constraint Point Minimization technique for estimating the regional component of the gravity misfit. Since constraints are used both for determining the regional field and for scoring its performance, the constraints must be split into testing and training sets. This function can perform both single and K-Folds cross-validations, determined by the number of “fold_x” columns in testing_training_df. If using more than one fold, the score for each parameter set is the mean of the scores of each fold. The total number of regional separations performed is n_trials * K, where K is the number of folds. The function then uses the optimal parameter values to redo the regional estimation using all the constraint points, not just the training points, and returns the results.

By default this performs a multi-objective optimization to find the best trade-off between the lowest RMSE of the residual misfit at the constraints and the highest RMS amplitude of the residual at all locations.

Choose the Constraint Point Minimization gridding method with the grid_method parameter, and supply the associated parameter limits via the <parameter>_limits parameters. For the grid method “eq_sources”, which has multiple parameters, if limits aren’t provided for one of the parameters, supply a constant value for that parameter in the keyword arguments, which are passed directly to DatasetAccessorInvert4Geom.regional_separation.

For synthetic testing, if the true regional grid is provided, the optimization can instead minimize the RMSE between the predicted and true regional gravity by setting optimize_on_true_regional_misfit=True.
- Parameters:
testing_training_df (DataFrame) – constraints dataframe with columns “easting”, “northing”, “upward”, and a column for each fold in the format “fold_0”, “fold_1”, etc. This can be created with the function cross_validation.split_test_train(). Each fold column should contain the strings “test” or “train” to indicate which rows are testing or training points. If more than one fold is provided, this function will perform a K-Folds cross-validation and the score for each set of parameters will be the mean of the K scores.
grid_method (str) – constraint point minimization method to use, choose between “verde” for bi-harmonic spline gridding, “pygmt” for tensioned minimum curvature gridding, or “eq_sources” for equivalent sources gridding.
grav_ds (Dataset) – gravity dataset with coordinates “easting”, “northing”, and variables “reg” and “gravity_anomaly”.
n_trials (int) – number of trials to run
tension_factor_limits (tuple[float, float]) – limits to use for the PyGMT tension factor gridding, by default (0, 1)
spline_damping_limits (tuple[float, float] | None) – limits to use for the Verde bi-harmonic spline damping, by default None
depth_limits (tuple[float, float] | None) – limits to use for the equivalent sources’ depths, by default None
block_size_limits (tuple[float, float] | None) – limits to use for the block size for fitting equivalent sources, by default None
damping_limits (tuple[float, float] | None) – limits to use for the damping value for fitting equivalent sources, by default None
grav_obs_height_limits (tuple[float, float] | None) – limits to use for the gravity observation height for fitting equivalent sources, by default None
sampler (BaseSampler | None) – customize the Optuna sampler, by default TPE sampler
plot (bool) – plot the resulting optimization figures, by default False
plot_grid (bool) – plot the resulting regional gravity grid, by default False
fold_progressbar (bool) – turn on or off a progress bar for the optimization of each fold if performing a K-Folds cross-validation within the optimization, by default False
optimize_on_true_regional_misfit (bool) – if the true_regional grid is provided, choose to perform the optimization on the RMSE between the true regional and the estimated regional, by default False
separate_metrics (bool) – if False, returns the scores combined with the formula residual_constraints_score / residual_amplitude_score; by default True, which returns both the residual and regional scores separately.
score_as_median (bool) – use the root median square instead of the root mean square for the scoring metric, by default False
true_regional (DataArray | None) – if the true regional gravity is known (in synthetic models), supply this as a grid to include a user_attr of the RMSE between this and the estimated regional for each trial, or set optimize_on_true_regional_misfit=True to have the optimization optimize on the RMSE, by default None
progressbar (bool) – add a progress bar, by default True
parallel (bool) – run the optimization in parallel, by default False
fname (str | None) – file name to save the study to, by default None
seed (int) – random seed for the samplers, by default 0
kwargs (Any) – additional keyword arguments to pass to DatasetAccessorInvert4Geom.regional_separation
- Return type:
tuple[optuna.study.Study, xarray.Dataset, optuna.trial.FrozenTrial]
- Returns:
study (optuna.study.Study) – the completed Optuna study
resulting_grav_ds (xarray.Dataset) – the resulting gravity dataset of the best trial
best_trial (optuna.trial.FrozenTrial) – the best trial
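The scoring bookkeeping described above can be illustrated with a minimal, standard-library-only sketch. It does not call invert4geom and is not the library's implementation: the helper names (count_folds, root_score, trial_score, combined_score) and the sample values are hypothetical, and reading “root median square” as the square root of the median of the squared values is an assumption.

```python
import math
import statistics


def count_folds(columns):
    """Count columns named fold_0, fold_1, ... (hypothetical helper)."""
    return sum(
        1 for name in columns
        if name.startswith("fold_") and name[len("fold_"):].isdigit()
    )


def root_score(values, as_median=False):
    """Root mean square, or root median square when as_median is True.

    Assumption: "root median square" means sqrt(median(values**2)).
    """
    squares = [v * v for v in values]
    central = statistics.median(squares) if as_median else statistics.mean(squares)
    return math.sqrt(central)


def trial_score(fold_scores):
    """Score of one parameter set: the mean of its per-fold scores."""
    return statistics.mean(fold_scores)


def combined_score(residual_constraints_score, residual_amplitude_score):
    """Single score following the formula quoted for separate_metrics=False."""
    return residual_constraints_score / residual_amplitude_score


# Column layout following the testing_training_df description, with K = 3 folds
columns = ["easting", "northing", "upward", "fold_0", "fold_1", "fold_2"]
n_trials = 20
k = count_folds(columns)
total_separations = n_trials * k  # one regional separation per trial per fold

# Hypothetical residual misfit values at the testing constraints of one fold
residuals = [3.0, -4.0, 0.0]
rms = root_score(residuals)
root_median = root_score(residuals, as_median=True)

print(k, total_separations, round(rms, 4), root_median)
```

In this sketch a lower residual_constraints_score and a higher residual_amplitude_score both lower the combined ratio, which is consistent with the trade-off described for the default multi-objective mode.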