invert4geom.optimization
========================

.. py:module:: invert4geom.optimization


Classes
-------

.. autoapisummary::

   invert4geom.optimization.OptimalInversionDamping
   invert4geom.optimization.OptimalInversionZrefDensity
   invert4geom.optimization.OptimalEqSourceParams
   invert4geom.optimization.OptimizeRegionalTrend
   invert4geom.optimization.OptimizeRegionalFilter
   invert4geom.optimization.OptimizeRegionalEqSources
   invert4geom.optimization.OptimizeRegionalConstraintsPointMinimization
   invert4geom.optimization.OptimalBuffer
   invert4geom.optimization.DuplicateIterationPruner


Functions
---------

.. autoapisummary::

   invert4geom.optimization.available_cpu_count
   invert4geom.optimization.run_optuna
   invert4geom.optimization._optuna_set_cores
   invert4geom.optimization._warn_parameter_at_limits
   invert4geom.optimization._log_optuna_results
   invert4geom.optimization._create_regional_separation_study
   invert4geom.optimization._logging_callback
   invert4geom.optimization._warn_limits_better_than_trial_1_param
   invert4geom.optimization._warn_limits_better_than_trial_multi_params
   invert4geom.optimization.optimize_inversion_damping
   invert4geom.optimization.optimize_inversion_zref_density_contrast
   invert4geom.optimization.optimize_inversion_zref_density_contrast_kfolds
   invert4geom.optimization.optimize_eq_source_params
   invert4geom.optimization.optimize_regional_filter
   invert4geom.optimization.optimize_regional_trend
   invert4geom.optimization.optimize_regional_eq_sources
   invert4geom.optimization.optimize_regional_constraint_point_minimization
   invert4geom.optimization.optimal_buffer


Module Contents
---------------

.. py:function:: available_cpu_count()

   Number of available virtual or physical CPUs on this system, i.e.
   user/real as output by time(1) when called with an optimally scaling
   userspace-only program.

   Adapted from https://stackoverflow.com/a/1006301/18686384
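
   A minimal sketch of one way to count usable CPUs with only the standard library. It
   is not the implementation used by this function; ``usable_cpu_count`` is a
   hypothetical helper that prefers the scheduler affinity mask where the platform
   provides it and otherwise falls back to ``os.cpu_count()``.

   .. code-block:: python

      import os


      def usable_cpu_count() -> int:
          """Return the number of CPUs this process may run on (at least 1)."""
          # sched_getaffinity is only available on some platforms (e.g. Linux)
          if hasattr(os, "sched_getaffinity"):
              return max(1, len(os.sched_getaffinity(0)))
          # cpu_count() can return None if the count cannot be determined
          return os.cpu_count() or 1


      n_cpus = usable_cpu_count()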


.. py:function:: run_optuna(study, objective, n_trials, storage = None, maximize_cpus = True, parallel = False, progressbar = None, callbacks = None)

   Run an Optuna optimization, optionally in parallel. Define the study and objective
   function beforehand and, if parallel is True, also the storage (preferably
   JournalStorage) and study name.


.. py:function:: _optuna_set_cores(n_trials, optimize_study, study_name, storage, objective, max_cores = True)

   Set up an Optuna optimization in parallel, either splitting the trials across all
   available cores or giving each available core 1 trial.
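
   The idea of dividing trials among cores could be sketched as below. This is one
   plausible reading of the description, not this function's code; ``split_trials``
   is a hypothetical helper that hands each worker a near-equal share of the trials and
   never starts more workers than there are trials.

   .. code-block:: python

      def split_trials(n_trials: int, n_cores: int) -> list[int]:
          """Split n_trials into per-worker counts that differ by at most one."""
          # never use more workers than trials, and always use at least one
          n_workers = max(1, min(n_trials, n_cores))
          base, extra = divmod(n_trials, n_workers)
          # the first `extra` workers each take one additional trial
          return [base + 1 if i < extra else base for i in range(n_workers)]


      many_trials = split_trials(10, 4)  # [3, 3, 2, 2]
      few_trials = split_trials(3, 8)  # [1, 1, 1], one trial per used core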


.. py:function:: _warn_parameter_at_limits(trial)

   Warn if any best parameter values are at their limits

   :param trial: optuna trial, most likely should be the best trial.
   :type trial: optuna.trial.FrozenTrial
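
   A minimal sketch of the kind of check described above, using plain dictionaries
   rather than an Optuna trial. ``params_at_limits`` and the example values are
   hypothetical and only illustrate comparing each best value against its search
   limits.

   .. code-block:: python

      import math


      def params_at_limits(params, limits):
          """Return names of parameters whose value sits on a search limit."""
          flagged = []
          for name, value in params.items():
              low, high = limits[name]
              if math.isclose(value, low) or math.isclose(value, high):
                  flagged.append(name)
          return flagged


      best_params = {"damping": 0.001, "depth": 5000.0}
      search_limits = {"damping": (0.001, 10.0), "depth": (1000.0, 100000.0)}
      flagged = params_at_limits(best_params, search_limits)  # ["damping"]

   A best value on a limit suggests the true optimum may lie outside the searched
   range, so widening that limit and re-running is usually worthwhile.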


.. py:function:: _log_optuna_results(trial)

   Log the results of an optuna trial

   :param trial: optuna trial
   :type trial: optuna.trial.FrozenTrial


.. py:function:: _create_regional_separation_study(optimize_on_true_regional_misfit, separate_metrics, sampler, true_regional = None, parallel = True, fname = None)

   Creates a study, sets directions and metric names based on the input parameters.

   :param optimize_on_true_regional_misfit: choose to optimize on the true regional misfit instead of the residual misfit at
                                            constraints and the residual misfit amplitude.
   :type optimize_on_true_regional_misfit: bool
   :param separate_metrics: choose to optimize on the residual misfit at constraints and the residual misfit
                            amplitude as separate metrics, as opposed to them as a ratio.
   :type separate_metrics: bool
   :param sampler: sampler object
   :type sampler: optuna.samplers.BaseSampler
   :param true_regional: grid of true regional values, by default None
   :type true_regional: xarray.DataArray | None, optional
   :param parallel: inform whether the study should be run in parallel, by default True. If
                    True, uses file storage, which slows down the optimization, but allows for
                    running in parallel.
   :type parallel: bool, optional
   :param fname: file name to save the study to, by default None
   :type fname: str | None, optional

   :returns: * **study** (*optuna.study.Study*) -- return a study object with direction, sampler, and metric names set
             * **storage** (*optuna.storages.BaseStorage | None*) -- return an optuna storage object if parallel is True, otherwise None


.. py:function:: _logging_callback(study, frozen_trial)

   Custom Optuna callback; only log trial info if it's the best value yet.

   :param study: optuna study
   :type study: optuna.study.Study
   :param frozen_trial: current trial
   :type frozen_trial: optuna.trial.FrozenTrial


.. py:function:: _warn_limits_better_than_trial_1_param(study, trial)

   Custom Optuna callback; warn if limits provide a better score than the current trial.

   :param study: optuna study
   :type study: optuna.study.Study
   :param trial: current trial
   :type trial: optuna.trial.FrozenTrial


.. py:function:: _warn_limits_better_than_trial_multi_params(study, trial)

   Custom Optuna callback; warn if limits provide a better score than the current trial
   for multiple parameter optimization.

   :param study: optuna study
   :type study: optuna.study.Study
   :param trial: current trial
   :type trial: optuna.trial.FrozenTrial


.. py:class:: OptimalInversionDamping(damping_limits, fname, plot_grids = False, **kwargs)

   Objective function to use in an Optuna optimization for finding the optimal damping
   regularization value for a gravity inversion. Used within function
   `optimize_inversion_damping()`.


   .. py:attribute:: fname


   .. py:attribute:: damping_limits


   .. py:attribute:: kwargs


   .. py:attribute:: plot_grids
      :value: False


   .. py:method:: __call__(trial)

      :param trial: the trial to run
      :type trial: optuna.trial

      :returns: the cross-validation score of the inversion
      :rtype: float


.. py:function:: optimize_inversion_damping(training_df, testing_df, n_trials, damping_limits, n_startup_trials = None, score_as_median = False, sampler = None, grid_search = False, fname = None, plot_cv = True, plot_grids = False, logx = True, logy = True, progressbar = True, parallel = False, seed = 0, **kwargs)

   Use Optuna to find the optimal damping regularization parameter for a gravity
   inversion. The optimization aims to minimize the cross-validation score,
   represented by the root mean (or median) squared error (RMSE), between the testing
   gravity data and the predicted gravity data after an inversion. Follows methods of
   :footcite:t:`uiedafast2017`.

   Provide upper and lower damping values, the number of trials to run, and specify
   whether to let Optuna choose the damping value for each trial or to use a grid
   search. The results are saved to pickle files containing the best inversion results
   and the study.

   :param training_df: rows of the gravity data frame which are just the training data
   :type training_df: pandas.DataFrame
   :param testing_df: rows of the gravity data frame which are just the testing data
   :type testing_df: pandas.DataFrame
   :param n_trials: number of damping values to try
   :type n_trials: int
   :param n_startup_trials: number of startup trials, by default is automatically determined
   :type n_startup_trials: int | None, optional
   :param damping_limits: upper and lower limits
   :type damping_limits: tuple[float, float]
   :param score_as_median: if True, changes the scoring from the root mean square to the root median
                           square, by default False
   :type score_as_median: bool, optional
   :param sampler: customize the optuna sampler, by default GridSampler if grid_search is
                   True, otherwise GPSampler
   :type sampler: optuna.samplers.BaseSampler | None, optional
   :param grid_search: search the entire parameter space between damping_limits in n_trials
                       steps, by default False
   :type grid_search: bool, optional
   :param fname: file name to save both study and inversion results to as pickle files, by
                 default fname is `tmp_x_damping_cv` where x is a random integer between 0 and
                 999 and will save study to <fname>_study.pickle and tuple of inversion results
                 to <fname>_results.pickle.
   :type fname: str, optional
   :param plot_cv: plot the cross-validation results, by default True
   :type plot_cv: bool, optional
   :param plot_grids: for each damping value, plot comparison of predicted and testing gravity data,
                      by default False
   :type plot_grids: bool, optional
   :param logx: make x axis of CV result plot on log scale, by default True
   :type logx: bool, optional
   :param logy: make y axis of CV result plot on log scale, by default True
   :type logy: bool, optional
   :param progressbar: add a progressbar, by default True
   :type progressbar: bool, optional
   :param parallel: run the optimization in parallel, by default False
   :type parallel: bool, optional
   :param seed: random seed for the samplers, by default 0
   :type seed: int, optional

   :returns: * **study** (*optuna.study*) -- the completed optuna study
             * **inv_results** (*tuple[pandas.DataFrame, pandas.DataFrame, dict[str, typing.Any], float]*) -- a tuple of the inversion results: topography dataframe, gravity dataframe,
               parameter values and elapsed time.
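
   The two scoring options selected by ``score_as_median`` can be sketched with the
   standard library alone. ``cv_score`` is a hypothetical helper written for
   illustration, not this module's scoring code; it shows why the root median square
   is less sensitive to a single large misfit than the root mean square.

   .. code-block:: python

      import math
      import statistics


      def cv_score(observed, predicted, as_median=False):
          """Root mean (or root median) square of the differences."""
          squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
          if as_median:
              central = statistics.median(squared)
          else:
              central = statistics.fmean(squared)
          return math.sqrt(central)


      testing = [1.0, 2.0, 3.0, 4.0]
      predicted = [1.0, 2.0, 3.0, 8.0]  # one large outlier
      rms = cv_score(testing, predicted)  # 2.0
      root_median_square = cv_score(testing, predicted, as_median=True)  # 0.0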


.. py:class:: OptimalInversionZrefDensity(fname, grav_df, constraints_df, regional_grav_kwargs, zref = None, zref_limits = None, density_contrast_limits = None, density_contrast = None, starting_topography = None, starting_topography_kwargs = None, progressbar = True, **kwargs)

   Objective function to use in an Optuna optimization for finding the optimal values
   for zref and/or density contrast for a gravity inversion. This class is used
   within the function `optimize_inversion_zref_density_contrast`. If using constraint
   point minimization for the regional separation, split constraints into testing and
   training sets and provide the testing set to argument `constraints_df` and the
   training set to the `constraints_df` argument of `regional_grav_kwargs`. To perform
   K-folds cross-validation, provide lists of constraints dataframes to both of these
   parameters, where each dataframe in each list corresponds to a fold.


   .. py:attribute:: fname


   .. py:attribute:: grav_df


   .. py:attribute:: constraints_df


   .. py:attribute:: regional_grav_kwargs


   .. py:attribute:: zref_limits
      :value: None


   .. py:attribute:: density_contrast_limits
      :value: None


   .. py:attribute:: zref
      :value: None


   .. py:attribute:: density_contrast
      :value: None


   .. py:attribute:: starting_topography
      :value: None


   .. py:attribute:: starting_topography_kwargs


   .. py:attribute:: progressbar
      :value: True


   .. py:attribute:: kwargs


   .. py:method:: __call__(trial)

      :param trial: the trial to run
      :type trial: optuna.trial

      :returns: the cross-validation score of the inversion
      :rtype: float


.. py:function:: optimize_inversion_zref_density_contrast(grav_df, constraints_df, n_trials, n_startup_trials = None, starting_topography = None, zref_limits = None, density_contrast_limits = None, zref = None, density_contrast = None, starting_topography_kwargs = None, regional_grav_kwargs = None, score_as_median = False, sampler = None, grid_search = False, fname = None, plot_cv = True, logx = False, logy = False, progressbar = True, parallel = False, fold_progressbar = True, seed = 0, **kwargs)

   Run an Optuna optimization to find the optimal zref and/or density contrast values
   for a gravity inversion. The optimization aims to minimize the cross-validation
   score, represented by the root mean (or median) squared error (RMSE), between
   points of known topography and the inverted topography. Follows methods of
   :footcite:t:`uiedafast2017`. This can optimize for either zref, density contrast,
   or both at the same time. Provide upper and lower limits for each parameter and the
   number of trials, and either let Optuna choose the parameter values for each trial or
   use a grid search to test all values between the limits in intervals set by
   n_trials. The results are saved to pickle files with the best inversion results and
   the study.

   Since each new set of zref and density values changes the starting model, for each
   set of parameters this function re-calculates the starting gravity, the gravity
   misfit, and its regional and residual components. `regional_grav_kwargs` are passed
   to `regional.regional_separation`. Once the optimal parameters are found, the
   regional separation and inversion are performed again and saved to
   <fname>_results.pickle and the study is saved to <fname>_study.pickle.

   The constraint point minimization regional separation technique uses constraint
   points to estimate the regional field, and since constraints are also used to
   calculate the scoring metric of this function, the constraints need to be separated
   into training (regional estimation) and testing (scoring) sets. To do this, supply
   the training constraints to `regional_grav_kwargs` via `method="constraints"` or
   `method="constraints_cv"` and `constraints_df`, and the testing constraints to this
   function as `constraints_df`.

   Typically there are not many constraints, and omitting some of them from the
   training set will significantly impact the regional estimation. To help with this,
   we can use a K-Folds approach, where for each set of parameter values, we perform
   this entire procedure K times, each time with a different separation of training
   and testing points, called a fold. The score associated with that parameter set is
   the mean of the K scores. Once the optimal parameter values are found, we then
   repeat the inversion using all of the constraints in the regional estimation. For a
   K-folds approach, supply lists of dataframes containing only each fold's testing or
   training points to the two `constraints_df` arguments. To automatically perform the
   test/train split and K-folds optimization, you can also use the convenience function
   `optimize_inversion_zref_density_contrast_kfolds`.

   :param grav_df: gravity data frame with columns `easting`, `northing`, `upward`, and
                   `gravity_anomaly`
   :type grav_df: pandas.DataFrame
   :param constraints_df: constraints data frame with columns `easting`, `northing`, and `upward`, or list
                          of dataframes for each fold of a cross-validation
   :type constraints_df: pandas.DataFrame or list[pandas.DataFrame]
   :param n_trials: number of trials, if grid_search is True, needs to be a perfect square and >=16.
   :type n_trials: int
   :param n_startup_trials: number of startup trials, by default is automatically determined
   :type n_startup_trials: int | None, optional
   :param starting_topography: a starting topography grid used to create the prisms layers. If not provided,
                               must provide region, spacing and dampings to starting_topography_kwargs, by
                               default None
   :type starting_topography: xarray.DataArray | None, optional
   :param zref_limits: upper and lower limits for the reference level, in meters, by default None
   :type zref_limits: tuple[float, float] | None, optional
   :param density_contrast_limits: upper and lower limits for the density contrast, in kg/m^3, by default None
   :type density_contrast_limits: tuple[float, float] | None, optional
   :param zref: if zref_limits not provided, must provide a constant zref value, by default None
   :type zref: float | None, optional
   :param density_contrast: if density_contrast_limits not provided, must provide a constant density
                            contrast value, by default None
   :type density_contrast: float | None, optional
   :param starting_topography_kwargs: dictionary with key: value pairs of "region":tuple[float, float, float, float],
                                      "spacing":float, and "dampings":float | list[float] | None, used to create
                                      a flat starting topography at each zref value if starting_topography not
                                      provided, by default None
   :type starting_topography_kwargs: dict[str, typing.Any] | None, optional
   :param regional_grav_kwargs: dictionary with kwargs to supply to `regional.regional_separation()`, by default
                                None
   :type regional_grav_kwargs: dict[str, typing.Any] | None, optional
   :param score_as_median: change scoring metric from root mean square to root median square, by default
                           False
   :type score_as_median: bool, optional
   :param sampler: customize the optuna sampler, by default uses GPSampler unless grid_search
                   is True, then uses GridSampler.
   :type sampler: optuna.samplers.BaseSampler | None, optional
   :param grid_search: Switch the sampler to GridSampler and search entire parameter space between
                       provided limits in intervals set by n_trials (for 1 parameter optimizations), or
                       by the square root of n_trials (for 2 parameter optimizations), by default False
   :type grid_search: bool, optional
   :param fname: file name to save both study and inversion results to as pickle files, by
                 default fname is `tmp_x_zref_density_cv` where x is a random integer between 0
                 and 999 and will save study to <fname>_study.pickle and tuple of inversion
                 results to <fname>_results.pickle.
   :type fname: str | None, optional
   :param plot_cv: plot the cross-validation results, by default True
   :type plot_cv: bool, optional
   :param logx: use a log scale for the cross-validation plot x-axis, by default False
   :type logx: bool, optional
   :param logy: use a log scale for the cross-validation plot y-axis, by default False
   :type logy: bool, optional
   :param progressbar: add a progressbar, by default True
   :type progressbar: bool, optional
   :param parallel: run the optimization in parallel, by default False
   :type parallel: bool, optional
   :param fold_progressbar: show a progress bar for each fold of the constraint-point minimization
                            cross-validation, by default True
   :type fold_progressbar: bool, optional
   :param seed: random seed for the samplers, by default 0
   :type seed: int, optional

   :returns: * **study** (*optuna.study*) -- the completed optuna study
             * **final_inversion_results** (*tuple[pandas.DataFrame, pandas.DataFrame, dict[str, typing.Any], float]*) -- a tuple of the inversion results: topography dataframe, gravity dataframe,
               parameter values and elapsed time.
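
   The grid-search requirement that ``n_trials`` be a perfect square for a
   two-parameter search can be illustrated with a small sketch. The helpers
   ``linspace`` and ``two_parameter_grid`` are hypothetical, and the evenly spaced
   values are an assumption; the sampler used by this function may space its values
   differently.

   .. code-block:: python

      import itertools
      import math


      def linspace(low, high, num):
          """Return num evenly spaced values from low to high, inclusive."""
          step = (high - low) / (num - 1)
          return [low + i * step for i in range(num)]


      def two_parameter_grid(zref_limits, density_contrast_limits, n_trials):
          """Build a square grid of (zref, density_contrast) pairs."""
          n_per_axis = math.isqrt(n_trials)
          if n_per_axis**2 != n_trials or n_trials < 16:
              msg = "n_trials must be a perfect square and >= 16"
              raise ValueError(msg)
          zrefs = linspace(*zref_limits, n_per_axis)
          densities = linspace(*density_contrast_limits, n_per_axis)
          # every zref value is paired with every density contrast value
          return list(itertools.product(zrefs, densities))


      # 16 trials gives 4 zref values x 4 density contrast values
      grid = two_parameter_grid((-900.0, 0.0), (300.0, 600.0), n_trials=16)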


.. py:function:: optimize_inversion_zref_density_contrast_kfolds(constraints_df, split_kwargs = None, **kwargs)

   Perform an optimization for zref and density contrast values, the same as the
   function `optimize_inversion_zref_density_contrast`, but pass a dataframe of
   constraint points and `split_kwargs`, which are both passed to `split_test_train` to
   create K-folds of testing and training constraints. For each set of zref/density
   values, regional separation and inversion are performed for each of the K-folds in
   the constraints dataframe. The score for each parameter set will be the mean of the
   K-folds scores. This then repeats for all parameter sets. Within each parameter set
   and fold, the training constraints are used for the regional separation and the
   testing constraints are used for scoring.

   This optimization performs a total number of inversions equal to K-folds * number of
   parameter sets. For 20 parameter sets and 5 K-folds, this is 100 inversions. This
   extra computational expense is only useful if the regional separation technique you
   supply via `regional_grav_kwargs` uses constraint points for the estimation, such
   as constraint point minimization (method='constraints_cv' or method='constraints').
   It is more efficient, but less accurate, to simply use a different regional
   estimation technique, which doesn't require constraint points, to find the optimal
   zref and density values, and then use these again in another inversion with the
   desired regional separation technique.

   Using the regional method of "constraints" will simply use the training points and
   supplied `grid_method` parameter values to calculate a regional field. Using the
   regional method of "constraints_cv" will take the training points and split these
   into a secondary set of training and testing points. These will be used internally
   in the regional separation to find the optimal `grid_method` parameters.

   :param constraints_df: constraints dataframe with columns "easting", "northing", and "upward".
   :type constraints_df: pandas.DataFrame
   :param split_kwargs: kwargs to be passed to `split_test_train` for splitting constraints_df into
                        test and train sets, by default None
   :type split_kwargs: dict[str, typing.Any] | None, optional
   :param \*\*kwargs: kwargs to be passed to `optimize_inversion_zref_density_contrast`
   :type \*\*kwargs: typing.Any

   :returns: * **study** (*optuna.study*) -- the completed optuna study
             * **inversion_results** (*tuple[pandas.DataFrame, pandas.DataFrame, dict[str, typing.Any], float]*) -- tuple of the best inversion results.
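
   The K-folds bookkeeping described above can be sketched as follows. ``make_folds``
   and ``kfold_score`` are hypothetical helpers, and the interleaved index split is an
   assumption made for brevity; it does not reproduce how `split_test_train` divides
   the constraints. The placeholder scorer stands in for a full regional separation
   and inversion.

   .. code-block:: python

      import statistics


      def make_folds(n_points, k):
          """Return k (train, test) pairs of point indices."""
          folds = [list(range(i, n_points, k)) for i in range(k)]
          splits = []
          for test in folds:
              held_out = set(test)
              train = [i for i in range(n_points) if i not in held_out]
              splits.append((train, test))
          return splits


      def kfold_score(score_one_fold, splits):
          """Mean of the per-fold scores for one parameter set."""
          scores = [score_one_fold(train, test) for train, test in splits]
          return statistics.fmean(scores)


      splits = make_folds(n_points=10, k=5)
      n_parameter_sets = 20
      n_inversions = len(splits) * n_parameter_sets  # 5 folds * 20 sets = 100

      # placeholder scorer: a real one would estimate the regional field from the
      # training constraints, run the inversion, and score at the testing constraints
      score = kfold_score(lambda train, test: float(len(test)), splits)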


.. py:class:: OptimalEqSourceParams(depth_limits = None, block_size_limits = None, damping_limits = None, **kwargs)

   Objective function to use in an Optuna optimization for finding the optimal
   equivalent source parameters for fitting to gravity data.


   .. py:attribute:: depth_limits
      :value: None


   .. py:attribute:: block_size_limits
      :value: None


   .. py:attribute:: damping_limits
      :value: None


   .. py:attribute:: kwargs


   .. py:method:: __call__(trial)

      :param trial: the trial to run
      :type trial: optuna.trial

      :returns: the score of the eq_sources fit
      :rtype: float


.. py:function:: optimize_eq_source_params(coordinates, data, n_trials = 100, damping_limits = None, depth_limits = None, block_size_limits = None, sampler = None, plot = False, progressbar = True, parallel = False, fname = None, seed = 0, **kwargs)

   Use Optuna to find the optimal parameters for fitting equivalent sources to gravity
   data. The 3 parameters are damping, depth, and block size. Any or all of these can
   be optimized at the same time. Provide upper and lower limits for each parameter,
   or if you don't want to optimize a parameter, provide a constant value of the
   parameter in the kwargs.

   :param coordinates: tuple of coordinates in the order (easting, northing, upward) for the gravity
                       observation locations.
   :type coordinates: tuple[pandas.Series | numpy.ndarray, pandas.Series | numpy.ndarray, pandas.Series | numpy.ndarray]
   :param data: gravity data values
   :type data: pandas.Series | numpy.ndarray
   :param n_trials: number of trials to run, by default 100
   :type n_trials: int, optional
   :param damping_limits: damping parameter limits, by default (0, 10**3)
   :type damping_limits: tuple[float, float], optional
   :param depth_limits: source depth limits (positive downwards) in meters, by default (0, 10e6)
   :type depth_limits: tuple[float, float], optional
   :param block_size_limits: block size limits in meters, by default None
   :type block_size_limits: tuple[float, float] | None, optional
   :param sampler: specify which Optuna sampler to use, by default GPSampler
   :type sampler: optuna.samplers.BaseSampler | None, optional
   :param plot: plot the resulting optimization figures, by default False
   :type plot: bool, optional
   :param progressbar: add a progressbar, by default True
   :type progressbar: bool, optional
   :param parallel: run the optimization in parallel, by default False
   :type parallel: bool, optional
   :param fname: file name to save the study to, by default None
   :type fname: str | None, optional
   :param seed: random seed for the samplers, by default 0
   :type seed: int, optional
   :param kwargs: additional keyword arguments to pass to `OptimalEqSourceParams`, which are
                  passed to `eq_sources_score`. These can include parameters to pass to
                  `harmonica.EquivalentSources`; "damping", "points", "depth", "block_size",
                  "parallel", and "dtype", or parameters to pass to `vd.cross_val_score`;
                  "delayed", or "weights".
   :type kwargs: typing.Any

   :returns: * **study** (*optuna.study*) -- the completed optuna study
             * **eqs** (*harmonica.EquivalentSources*) -- the fitted equivalent sources model


.. py:class:: OptimizeRegionalTrend(trend_limits, optimize_on_true_regional_misfit = False, separate_metrics = True, **kwargs)

   Objective function to use in an Optuna optimization for finding the optimal trend
   order for estimating the regional component of gravity misfit.


   .. py:attribute:: trend_limits


   .. py:attribute:: optimize_on_true_regional_misfit
      :value: False


   .. py:attribute:: separate_metrics
      :value: True


   .. py:attribute:: kwargs


   .. py:method:: __call__(trial)

      :param trial: the trial to run
      :type trial: optuna.trial

      :returns: the scores
      :rtype: float


.. py:class:: OptimizeRegionalFilter(filter_width_limits, optimize_on_true_regional_misfit = False, separate_metrics = True, **kwargs)

   Objective function to use in an Optuna optimization for finding the optimal filter
   width for estimating the regional component of gravity misfit.


   .. py:attribute:: filter_width_limits


   .. py:attribute:: optimize_on_true_regional_misfit
      :value: False


   .. py:attribute:: separate_metrics
      :value: True


   .. py:attribute:: kwargs


   .. py:method:: __call__(trial)

      :param trial: the trial to run
      :type trial: optuna.trial

      :returns: the scores
      :rtype: float


.. py:class:: OptimizeRegionalEqSources(depth_limits = None, block_size_limits = None, damping_limits = None, grav_obs_height_limits = None, optimize_on_true_regional_misfit = False, separate_metrics = True, **kwargs)

   Objective function to use in an Optuna optimization for finding the optimal
   equivalent source parameters for estimating the regional component of gravity
   misfit.


   .. py:attribute:: depth_limits
      :value: None


   .. py:attribute:: block_size_limits
      :value: None


   .. py:attribute:: damping_limits
      :value: None


   .. py:attribute:: grav_obs_height_limits
      :value: None


   .. py:attribute:: optimize_on_true_regional_misfit
      :value: False


   .. py:attribute:: separate_metrics
      :value: True


   .. py:attribute:: kwargs


   .. py:method:: __call__(trial)

      :param trial: the trial to run
      :type trial: optuna.trial

      :returns: the scores
      :rtype: float


.. py:class:: OptimizeRegionalConstraintsPointMinimization(training_df, testing_df, grid_method, tension_factor_limits = (0, 1), spline_damping_limits = None, depth_limits = None, block_size_limits = None, damping_limits = None, grav_obs_height_limits = None, optimize_on_true_regional_misfit = False, separate_metrics = True, progressbar = False, **kwargs)

   Objective function to use in an Optuna optimization for finding the optimal
   hyperparameter values for the Constraint Point Minimization technique for estimating
   the regional component of gravity misfit. If single dataframes are supplied to
   `training_df` and `testing_df`, for each parameter value a regional field will be
   estimated using the `training_df`, and a score calculated using the `testing_df`. If
   lists of dataframes are supplied, a score will be calculated for each item in the
   list and the mean of the scores will be the metric returned. This class is used with
   the function `optimize_regional_constraint_point_minimization`.


   .. py:attribute:: training_df


   .. py:attribute:: testing_df


   .. py:attribute:: grid_method


   .. py:attribute:: tension_factor_limits
      :value: (0, 1)


   .. py:attribute:: spline_damping_limits
      :value: None


   .. py:attribute:: depth_limits
      :value: None


   .. py:attribute:: block_size_limits
      :value: None


   .. py:attribute:: damping_limits
      :value: None


   .. py:attribute:: grav_obs_height_limits
      :value: None


   .. py:attribute:: optimize_on_true_regional_misfit
      :value: False


   .. py:attribute:: separate_metrics
      :value: True


   .. py:attribute:: progressbar
      :value: False


   .. py:attribute:: kwargs


   .. py:method:: __call__(trial)

      :param trial: the trial to run
      :type trial: optuna.trial

      :returns: the scores
      :rtype: float


.. py:function:: optimize_regional_filter(testing_df, grav_df, filter_width_limits, score_as_median = False, remove_starting_grav_mean = False, true_regional = None, n_trials = 100, sampler = None, plot = False, plot_grid = False, optimize_on_true_regional_misfit = False, separate_metrics = True, progressbar = True, parallel = False, fname = None, seed = 0)

   Run an Optuna optimization to find the optimal filter width for estimating the
   regional component of gravity misfit. For synthetic testing, if the true regional
   grid is provided, the optimization can be set to optimize on the RMSE of the
   predicted and true regional gravity, by setting
   `optimize_on_true_regional_misfit=True`. By default this will perform a
   multi-objective optimization to find the best trade-off between the lowest RMSE of
   the residual at the constraints and the highest RMSE of the residual at all
   locations.

   :param testing_df: constraint points to use for calculating the score with columns "easting",
                      "northing" and "upward".
   :type testing_df: pandas.DataFrame
   :param grav_df: gravity dataframe with columns "easting", "northing", "reg", and
                   `gravity_anomaly`.
   :type grav_df: pandas.DataFrame
   :param filter_width_limits: limits to use for the filter width in meters.
   :type filter_width_limits: tuple[float, float]
   :param score_as_median: use the root median square instead of the root mean square for the scoring
                           metric, by default False
   :type score_as_median: bool, optional
   :param remove_starting_grav_mean: remove the mean of the starting gravity data before estimating the regional.
                                     Useful to mitigate effects of poorly-chosen zref value. By default False
   :type remove_starting_grav_mean: bool, optional
   :param true_regional: if the true regional gravity is known (in synthetic models), supply this as a
                         grid to include a user_attr of the RMSE between this and the estimated regional
                         for each trial, or set `optimize_on_true_regional_misfit=True` to have the
                         optimization optimize on the RMSE, by default None
   :type true_regional: xarray.DataArray | None, optional
   :param n_trials: number of trials to run, by default 100
   :type n_trials: int, optional
   :param sampler: customize the optuna sampler, by default TPE sampler
   :type sampler: optuna.samplers.BaseSampler | None, optional
   :param plot: plot the resulting optimization figures, by default False
   :type plot: bool, optional
   :param plot_grid: plot the resulting regional gravity grid, by default False
   :type plot_grid: bool, optional
   :param optimize_on_true_regional_misfit: if the true_regional grid is provided, choose to perform optimization on the RMSE
                                            between the true regional and the estimated regional, by default False
   :type optimize_on_true_regional_misfit: bool, optional
   :param separate_metrics: if False, returns the scores combined with the formula
                            residual_constraints_score / residual_amplitude_score, by default is True and
                            returns both the residual and regional scores separately.
   :type separate_metrics: bool, optional
   :param progressbar: add a progressbar, by default True
   :type progressbar: bool, optional
   :param parallel: run the optimization in parallel, by default False
   :type parallel: bool, optional
   :param fname: file name to save the study to, by default None
   :type fname: str | None, optional
   :param seed: random seed for the samplers, by default 0
   :type seed: int, optional

   :returns: * **study** (*optuna.study.Study*) -- the completed Optuna study
             * **resulting_grav_df** (*pandas.DataFrame*) -- the resulting gravity dataframe of the best trial
             * **best_trial** (*optuna.trial.FrozenTrial*) -- the best trial
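
The `score_as_median` option described above swaps the mean for the median inside the root-square score. A minimal sketch of the two metrics, using only the standard library; the helper name `root_square_score` is hypothetical and not part of invert4geom, and the residual values are made up for illustration:

```python
import math
import statistics


def root_square_score(residuals, as_median=False):
    """Root mean square, or root median square, of a sequence of residuals."""
    squares = [r**2 for r in residuals]
    # The only difference between the two metrics is how the squares are averaged.
    center = statistics.median(squares) if as_median else statistics.fmean(squares)
    return math.sqrt(center)


residuals = [1.0, -1.0, 2.0, -2.0, 10.0]  # illustrative values with one outlier
rms = root_square_score(residuals)  # root mean square
rmeds = root_square_score(residuals, as_median=True)  # root median square
```

The median-based score is far less sensitive to the single large residual, which is the usual reason to prefer it when a few constraint points are suspect.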


.. py:function:: optimize_regional_trend(testing_df, grav_df, trend_limits, score_as_median = False, remove_starting_grav_mean = False, true_regional = None, sampler = None, plot = False, plot_grid = False, optimize_on_true_regional_misfit = False, separate_metrics = True, progressbar = True, parallel = False, fname = None, seed = 0)

   Run an Optuna optimization to find the optimal trend order for estimating the
   regional component of gravity misfit. For synthetic testing, if the true regional
   grid is provided, the optimization can be set to optimize on the RMSE of the
   predicted and true regional gravity, by setting
   `optimize_on_true_regional_misfit=True`. By default this will perform a
   multi-objective optimization to find the best trade-off between the lowest RMSE of
   the residual at the constraints and the highest RMSE of the residual at all
   locations.

   :param testing_df: constraint points to use for calculating the score with columns "easting",
                      "northing" and "upward".
   :type testing_df: pandas.DataFrame
   :param grav_df: gravity dataframe with columns "easting", "northing", "reg" and
                   "gravity_anomaly".
   :type grav_df: pandas.DataFrame
   :param trend_limits: limits to use for the trend order in degrees.
   :type trend_limits: tuple[int, int]
   :param score_as_median: use the root median square instead of the root mean square for the scoring
                           metric, by default False
   :type score_as_median: bool, optional
   :param remove_starting_grav_mean: remove the mean of the starting gravity data before estimating the regional.
                                     Useful to mitigate effects of poorly-chosen zref value. By default False
   :type remove_starting_grav_mean: bool, optional
   :param true_regional: if the true regional gravity is known (in synthetic models), supply this as a
                         grid to include a user_attr of the RMSE between this and the estimated regional
                         for each trial, or set `optimize_on_true_regional_misfit=True` to have the
                         optimization optimize on the RMSE, by default None
   :type true_regional: xarray.DataArray | None, optional
   :param sampler: customize the optuna sampler, by default GridSampler
   :type sampler: optuna.samplers.BaseSampler | None, optional
   :param plot: plot the resulting optimization figures, by default False
   :type plot: bool, optional
   :param plot_grid: plot the resulting regional gravity grid, by default False
   :type plot_grid: bool, optional
   :param optimize_on_true_regional_misfit: if the true_regional grid is provided, perform the optimization on the RMSE
                                            between the true regional and the estimated regional, by default False
   :type optimize_on_true_regional_misfit: bool, optional
   :param separate_metrics: if False, returns the scores combined with the formula
                            residual_constraints_score / residual_amplitude_score, by default is True and
                            returns both the residual and regional scores separately.
   :type separate_metrics: bool, optional
   :param progressbar: add a progressbar, by default True
   :type progressbar: bool, optional
   :param parallel: run the optimization in parallel, by default False
   :type parallel: bool, optional
   :param fname: file name to save the study to, by default None
   :type fname: str | None, optional
   :param seed: random seed for the samplers, by default 0
   :type seed: int, optional

   :returns: * **study** (*optuna.study.Study*) -- the completed Optuna study
             * **resulting_grav_df** (*pandas.DataFrame*) -- the resulting gravity dataframe of the best trial
             * **best_trial** (*optuna.trial.FrozenTrial*) -- the best trial
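
With `separate_metrics=False`, the two scores are combined as residual_constraints_score / residual_amplitude_score. The sketch below illustrates that ratio for a handful of trend orders. The score values are made up, the helper `combined_score` is hypothetical, and it is an assumption here that a lower ratio is better (low residual at the constraints, high residual amplitude elsewhere):

```python
def combined_score(residual_constraints_score, residual_amplitude_score):
    """Single-objective score: constraint residual divided by residual amplitude."""
    return residual_constraints_score / residual_amplitude_score


# Made-up (constraints score, amplitude score) pairs, keyed by trend order.
scores_by_order = {
    1: (4.0, 8.0),
    2: (2.0, 8.0),
    3: (1.5, 3.0),
}

ratios = {order: combined_score(*pair) for order, pair in scores_by_order.items()}

# Assumption: the trial with the smallest ratio is preferred.
best_order = min(ratios, key=ratios.get)
```

In this toy case the third-order trend has the lowest residual at the constraints, but it also removes most of the residual amplitude, so the second-order trend gives the smaller ratio.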


.. py:function:: optimize_regional_eq_sources(testing_df, grav_df, score_as_median = False, true_regional = None, n_trials = 100, depth_limits = None, block_size_limits = None, damping_limits = None, grav_obs_height_limits = None, sampler = None, plot = False, plot_grid = False, optimize_on_true_regional_misfit = False, separate_metrics = True, progressbar = True, parallel = False, fname = None, seed = 0, **kwargs)

   Run an Optuna optimization to find the optimal equivalent source parameters for
   estimating the regional component of gravity misfit. For synthetic testing, if the
   true regional grid is provided, the optimization can be set to optimize on the
   RMSE of the predicted and true regional gravity, by setting
   `optimize_on_true_regional_misfit=True`. By default this will perform a
   multi-objective optimization to find the best trade-off between the lowest RMSE of
   the residual at the constraints and the highest RMSE of the residual at all
   locations.

   :param testing_df: constraint points to use for calculating the score with columns "easting",
                      "northing" and "upward".
   :type testing_df: pandas.DataFrame
   :param grav_df: gravity dataframe with columns "easting", "northing", "reg", and
                   "gravity_anomaly".
   :type grav_df: pandas.DataFrame
   :param score_as_median: use the root median square instead of the root mean square for the scoring
                           metric, by default False
   :type score_as_median: bool, optional
   :param true_regional: if the true regional gravity is known (in synthetic models), supply this as a
                         grid to include a user_attr of the RMSE between this and the estimated regional
                         for each trial, or set `optimize_on_true_regional_misfit=True` to have the
                         optimization optimize on the RMSE, by default None
   :type true_regional: xarray.DataArray | None, optional
   :param n_trials: number of trials to run, by default 100
   :type n_trials: int, optional
   :param depth_limits: limits to use for source depths, positive down in meters, by default None
   :type depth_limits: tuple[float, float] | None, optional
   :param block_size_limits: limits to use for block size in meters, by default None
   :type block_size_limits: tuple[float, float] | None, optional
   :param damping_limits: limits to use for the damping parameter, by default None
   :type damping_limits: tuple[float, float] | None, optional
   :param grav_obs_height_limits: limits to use for the gravity observation height in meters, by default None
   :type grav_obs_height_limits: tuple[float, float] | None, optional
   :param sampler: customize the optuna sampler, by default TPE sampler
   :type sampler: optuna.samplers.BaseSampler | None, optional
   :param plot: plot the resulting optimization figures, by default False
   :type plot: bool, optional
   :param plot_grid: plot the resulting regional gravity grid, by default False
   :type plot_grid: bool, optional
   :param optimize_on_true_regional_misfit: if the true_regional grid is provided, perform the optimization on the RMSE
                                            between the true regional and the estimated regional, by default False
   :type optimize_on_true_regional_misfit: bool, optional
   :param separate_metrics: if False, returns the scores combined with the formula
                            residual_constraints_score / residual_amplitude_score, by default is True and
                            returns both the residual and regional scores separately.
   :type separate_metrics: bool, optional
   :param progressbar: add a progressbar, by default True
   :type progressbar: bool, optional
   :param parallel: run the optimization in parallel, by default False
   :type parallel: bool, optional
   :param fname: file name to save the study to, by default None
   :type fname: str | None, optional
   :param seed: random seed for the samplers, by default 0
   :type seed: int, optional
   :param kwargs: additional keyword arguments to pass to `regional.regional_separation`
   :type kwargs: typing.Any

   :returns: * **study** (*optuna.study.Study*) -- the completed Optuna study
             * **resulting_grav_df** (*pandas.DataFrame*) -- the resulting gravity dataframe of the best trial
             * **best_trial** (*optuna.trial.FrozenTrial*) -- the best trial
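
In the default multi-objective mode there is generally no single best trial, only a set of trade-offs between a low RMSE at the constraints and a high residual RMSE everywhere else. The sketch below shows, in plain Python, how such a set of non-dominated trials (a Pareto front) can be picked out. The function `pareto_front` and the score values are illustrative only and are not how invert4geom or Optuna implement it; for multi-objective studies Optuna exposes the equivalent set as `study.best_trials`.

```python
def pareto_front(trials):
    """Return the trials that are not dominated by any other trial.

    Each trial is a (constraints_rmse, amplitude_rmse) pair. The first value is
    minimized and the second is maximized.
    """
    front = []
    for i, (c_i, a_i) in enumerate(trials):
        dominated = any(
            # Trial j is at least as good in both objectives and strictly better in one.
            c_j <= c_i and a_j >= a_i and (c_j < c_i or a_j > a_i)
            for j, (c_j, a_j) in enumerate(trials)
            if j != i
        )
        if not dominated:
            front.append((c_i, a_i))
    return front


# Made-up scores for four trials.
trials = [(1.0, 2.0), (2.0, 5.0), (3.0, 4.0), (1.5, 2.0)]
front = pareto_front(trials)
```

Here the third and fourth trials are each beaten on both objectives by another trial, so only the first two remain as candidate trade-offs.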


.. py:function:: optimize_regional_constraint_point_minimization(testing_training_df, grid_method, grav_df, n_trials, tension_factor_limits = (0, 1), spline_damping_limits = None, depth_limits = None, block_size_limits = None, damping_limits = None, grav_obs_height_limits = None, sampler = None, plot = False, plot_grid = False, fold_progressbar = False, optimize_on_true_regional_misfit = False, separate_metrics = True, score_as_median = False, true_regional = None, progressbar = True, parallel = False, fname = None, seed = 0, **kwargs)

   Run an Optuna optimization to find the optimal hyperparameters for the Constraint
   Point Minimization (CPM) technique for estimating the regional component of gravity
   misfit. Since constraints are used both for determining the regional field, and for
   the scoring of the performance, we must split the constraints into testing and
   training sets. This function can perform both single and K-Folds cross validations,
   determined by the number of "fold_x" columns in testing_training_df. If using more
   than one fold, the score for each parameter set is the mean of the scores of each
   fold. The total number of regional separations performed is n_trials * K, where K is
   the number of folds.
   This function then uses the optimal parameter values to redo the regional
   estimation using all the constraint points, not just the training points, and
   returns the results.
   By default this will perform a multi-objective optimization to
   find the best trade-off between the lowest RMSE of the residual misfit at the
   constraints and the highest RMS amplitude of the residual at all locations.
   Choose the CPM gridding method with the `grid_method` parameter, and supply the
   associated parameter limits via the `<parameter>_limits` arguments. For grid
   method "eq_sources", which has multiple parameters, if limits aren't provided for
   one of the parameters, supply a constant value for that parameter in the keyword
   arguments, which are passed directly to `regional.regional_separation`.
   For synthetic testing, if the true regional grid is provided, the optimization can
   be set to optimize on the RMSE of the predicted and true regional gravity, by
   setting `optimize_on_true_regional_misfit=True`.

   :param testing_training_df: constraints dataframe with columns "easting", "northing", "upward", and a column
                               for each fold in the format "fold_0", "fold_1", etc. This can be created with
                               function `cross_validation.split_test_train()`. Each fold column should have
                               strings of "test" or "train" to indicate which rows are testing or training
                               points. If more than one fold is provided, this function will perform a K-Folds
                               cross validation and the score for each set of parameters will be the mean of
                               the K-scores.
   :type testing_training_df: pandas.DataFrame
   :param grid_method: constraint point minimization method to use, choose between "verde" for
                       bi-harmonic spline gridding, "pygmt" for tensioned minimum curvature gridding,
                       or "eq_sources" for equivalent sources gridding.
   :type grid_method: str
   :param grav_df: gravity dataframe with columns "easting", "northing", "reg", and
                   "gravity_anomaly".
   :type grav_df: pandas.DataFrame
   :param n_trials: number of trials to run
   :type n_trials: int
   :param tension_factor_limits: limits to use for the tension factor of the PyGMT gridding, by default (0, 1)
   :type tension_factor_limits: tuple[float, float], optional
   :param spline_damping_limits: limits to use for the Verde bi-harmonic spline damping, by default None
   :type spline_damping_limits: tuple[float, float] | None, optional
   :param depth_limits: limits to use for the equivalent sources' depths, by default None
   :type depth_limits: tuple[float, float] | None, optional
   :param block_size_limits: limits to use for the block size for fitting equivalent sources, by default None
   :type block_size_limits: tuple[float, float] | None, optional
   :param damping_limits: limits to use for the damping value for fitting equivalent sources, by default
                          None
   :type damping_limits: tuple[float, float] | None, optional
   :param grav_obs_height_limits: limits to use for the gravity observation height for fitting equivalent sources,
                                  by default None
   :type grav_obs_height_limits: tuple[float, float] | None, optional
   :param sampler: customize the optuna sampler, by default TPE sampler
   :type sampler: optuna.samplers.BaseSampler | None, optional
   :param plot: plot the resulting optimization figures, by default False
   :type plot: bool, optional
   :param plot_grid: plot the resulting regional gravity grid, by default False
   :type plot_grid: bool, optional
   :param fold_progressbar: turn on or off a progress bar for the optimization of each fold if performing
                            a K-Folds cross-validation within the optimization, by default False
   :type fold_progressbar: bool, optional
   :param optimize_on_true_regional_misfit: if the true_regional grid is provided, perform the optimization on the RMSE
                                            between the true regional and the estimated regional, by default False
   :type optimize_on_true_regional_misfit: bool, optional
   :param separate_metrics: if False, returns the scores combined with the formula
                            residual_constraints_score / residual_amplitude_score, by default is True and
                            returns both the residual and regional scores separately.
   :type separate_metrics: bool, optional
   :param score_as_median: use the root median square instead of the root mean square for the scoring
                           metric, by default False
   :type score_as_median: bool, optional
   :param true_regional: if the true regional gravity is known (in synthetic models), supply this as a
                         grid to include a user_attr of the RMSE between this and the estimated regional
                         for each trial, or set `optimize_on_true_regional_misfit=True` to have the
                         optimization optimize on the RMSE, by default None
   :type true_regional: xarray.DataArray | None, optional
   :param progressbar: add a progressbar, by default True
   :type progressbar: bool, optional
   :param parallel: run the optimization in parallel, by default False
   :type parallel: bool, optional
   :param fname: file name to save the study to, by default None
   :type fname: str | None, optional
   :param seed: random seed for the samplers, by default 0
   :type seed: int, optional
   :param kwargs: additional keyword arguments to pass to `regional.regional_separation`
   :type kwargs: typing.Any

   :returns: * **study** (*optuna.study.Study*) -- the completed Optuna study
             * **resulting_grav_df** (*pandas.DataFrame*) -- the resulting gravity dataframe of the best trial
             * **best_trial** (*optuna.trial.FrozenTrial*) -- the best trial
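
The fold layout expected in `testing_training_df` and the K-Folds scoring described above can be sketched without any dependencies. In the example below, plain dictionaries stand in for dataframe rows, the helpers `fold_columns`, `split_fold` and `kfold_score` are hypothetical, and the scorer is a placeholder rather than a real regional separation:

```python
import statistics

# Illustrative constraint rows. A real table would also hold the "easting",
# "northing" and "upward" columns.
rows = [
    {"id": 0, "fold_0": "test", "fold_1": "train"},
    {"id": 1, "fold_0": "train", "fold_1": "test"},
    {"id": 2, "fold_0": "train", "fold_1": "train"},
    {"id": 3, "fold_0": "test", "fold_1": "train"},
]


def fold_columns(rows):
    """Names of all fold columns, in sorted order."""
    return sorted(key for key in rows[0] if key.startswith("fold_"))


def split_fold(rows, fold):
    """Split the rows into (training, testing) lists for one fold column."""
    train = [r for r in rows if r[fold] == "train"]
    test = [r for r in rows if r[fold] == "test"]
    return train, test


def kfold_score(rows, score_fold):
    """Mean of the per-fold scores returned by the callable ``score_fold``."""
    scores = [score_fold(*split_fold(rows, fold)) for fold in fold_columns(rows)]
    return statistics.fmean(scores)


# Placeholder scorer: the number of test points stands in for a real RMSE.
mean_score = kfold_score(rows, lambda train, test: float(len(test)))

# Total number of regional separations for n_trials = 20 and K = 2 folds.
n_separations = 20 * len(fold_columns(rows))
```

A table with a single "fold_0" column reduces this to an ordinary single train/test split, since the mean of one score is that score.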


.. py:function:: optimal_buffer(target, buffer_perc_limits = (1, 50), n_trials = 25, sampler = None, grid_search = False, fname = None, progressbar = True, parallel = False, plot = True, seed = 0, **kwargs)

   Run an optimization to find the best buffer zone width.


.. py:class:: OptimalBuffer(buffer_perc_limits, target, fname, **kwargs)

   Objective function to use in an Optuna optimization for finding the buffer zone
   width as a percentage of region width which limits the gravity decay (edge effects)
   to a specified amount within a region of interest. Used within function
   :func:`optimal_buffer`.


   .. py:attribute:: fname


   .. py:attribute:: buffer_perc_limits


   .. py:attribute:: target


   .. py:attribute:: kwargs


   .. py:method:: __call__(trial)

      :param trial: the trial to run
      :type trial: optuna.trial

      :returns: the score of the trial
      :rtype: float
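
To make "buffer zone width as a percentage of region width" concrete, the sketch below pads a region by such a percentage. It is only an illustration: the helper `buffer_region` is hypothetical, and it is an assumption here that the width is taken along the easting direction and that the same buffer is applied to all four sides.

```python
def buffer_region(region, buffer_perc):
    """Pad (xmin, xmax, ymin, ymax) on every side by a percentage of its width."""
    xmin, xmax, ymin, ymax = region
    # Assumption: "region width" is the easting extent of the region.
    buffer_width = (xmax - xmin) * buffer_perc / 100
    return (
        xmin - buffer_width,
        xmax + buffer_width,
        ymin - buffer_width,
        ymax + buffer_width,
    )


# A 1000 m wide region with a 10 % buffer gains 100 m on every side.
padded = buffer_region((0.0, 1000.0, 0.0, 500.0), 10)
```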


.. py:class:: DuplicateIterationPruner

   Bases: :py:obj:`optuna.pruners.BasePruner`


   DuplicatePruner

   Pruner to detect duplicate trials based on the parameters.

   This pruner is used to identify and prune trials that have the same set of
   parameters as a previously completed trial.


   .. py:method:: prune(study, trial)

      Judge whether the trial should be pruned based on the reported values.

      Note that this method is not supposed to be called by library users. Instead,
      :func:`optuna.trial.Trial.report` and :func:`optuna.trial.Trial.should_prune` provide
      user interfaces to implement pruning mechanism in an objective function.

      :param study: Study object of the target study.
      :param trial: FrozenTrial object of the target trial.
                    Take a copy before modifying this object.

      :returns: A boolean value representing whether the trial should be pruned.
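
The duplicate check described for `DuplicateIterationPruner` comes down to comparing a trial's parameters with those of trials that have already completed. The sketch below shows that comparison in isolation. `FakeTrial` is a hypothetical stand-in for an Optuna trial, `is_duplicate` is not part of invert4geom, and the parameter name "damping" is only an example:

```python
from dataclasses import dataclass, field


@dataclass
class FakeTrial:
    """Hypothetical stand-in for an Optuna trial: a number, its params and a state."""

    number: int
    params: dict = field(default_factory=dict)
    state: str = "COMPLETE"


def is_duplicate(previous_trials, trial):
    """True if a different, completed trial used exactly the same parameters."""
    return any(
        t.state == "COMPLETE" and t.number != trial.number and t.params == trial.params
        for t in previous_trials
    )


history = [
    FakeTrial(0, {"damping": 0.1}),
    FakeTrial(1, {"damping": 0.5}),
    FakeTrial(2, {"damping": 0.9}, state="FAIL"),
]

repeat = is_duplicate(history, FakeTrial(3, {"damping": 0.5}))
fresh = is_duplicate(history, FakeTrial(3, {"damping": 0.7}))
retry_of_failed = is_duplicate(history, FakeTrial(3, {"damping": 0.9}))
```

Only completed trials count as duplicates in this sketch, so a parameter set that previously failed can be tried again.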