invert4geom.utils
=================

.. py:module:: invert4geom.utils


Classes
-------

.. autoapisummary::

   invert4geom.utils.DuplicateFilter


Functions
---------

.. autoapisummary::

   invert4geom.utils._log_level
   invert4geom.utils.environ
   invert4geom.utils._check_constraints_inside_gravity_region
   invert4geom.utils._check_gravity_inside_topography_region
   invert4geom.utils.rmse
   invert4geom.utils.nearest_grid_fill
   invert4geom.utils.filter_grid
   invert4geom.utils.dist_nearest_points
   invert4geom.utils.normalize
   invert4geom.utils.normalize_xarray
   invert4geom.utils.scale_normalized
   invert4geom.utils.normalized_mindist
   invert4geom.utils.sample_grids
   invert4geom.utils.extract_prism_data
   invert4geom.utils.get_spacing
   invert4geom.utils.sample_bounding_surfaces
   invert4geom.utils.enforce_confining_surface
   invert4geom.utils.apply_surface_correction
   invert4geom.utils.update_prisms_ds
   invert4geom.utils.add_updated_prism_properties
   invert4geom.utils.create_topography
   invert4geom.utils.grids_to_prisms
   invert4geom.utils.best_spline_cv
   invert4geom.utils.best_equivalent_source_damping
   invert4geom.utils.eq_sources_score
   invert4geom.utils.gravity_decay_buffer


Module Contents
---------------

.. py:function:: _log_level(level)

   Run body with logger at a different level
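
The general pattern behind a helper like this can be sketched with the standard library alone. The name ``temporary_log_level`` and the explicit ``logger`` argument are assumptions made for illustration; the signature of ``_log_level`` above takes only ``level``, and its actual implementation may differ.

```python
import contextlib
import logging


@contextlib.contextmanager
def temporary_log_level(logger, level):
    """Hypothetical helper: run the ``with`` body at ``level``, then restore."""
    previous = logger.level
    logger.setLevel(level)
    try:
        yield logger
    finally:
        # Restore the old level even if the body raises.
        logger.setLevel(previous)


log = logging.getLogger("log_level_sketch")
log.setLevel(logging.INFO)

with temporary_log_level(log, logging.ERROR):
    level_inside = log.level  # only errors and above are emitted here

level_after = log.level  # back to INFO
```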


.. py:function:: environ(**env)

   Temporarily set environment variables, resetting them afterwards.
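
A minimal sketch of this kind of context manager, assuming the previous values are restored on exit. The name ``temporary_environ`` and the variable ``I4G_SKETCH_FLAG`` are hypothetical and not part of the module.

```python
import contextlib
import os


@contextlib.contextmanager
def temporary_environ(**env):
    """Hypothetical helper: set variables for the ``with`` body, then restore."""
    saved = {key: os.environ.get(key) for key in env}
    os.environ.update({key: str(value) for key, value in env.items()})
    try:
        yield
    finally:
        for key, old_value in saved.items():
            if old_value is None:
                # The variable did not exist before, so remove it again.
                os.environ.pop(key, None)
            else:
                os.environ[key] = old_value


os.environ.pop("I4G_SKETCH_FLAG", None)  # start from a known state

with temporary_environ(I4G_SKETCH_FLAG="1"):
    value_inside = os.environ["I4G_SKETCH_FLAG"]

still_set = "I4G_SKETCH_FLAG" in os.environ
```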


.. py:class:: DuplicateFilter(log)

   Filters away duplicate log messages.
   Adapted from https://stackoverflow.com/a/60462619/18686384


   .. py:attribute:: msgs


   .. py:attribute:: log


   .. py:method:: filter(record)


   .. py:method:: __enter__()


   .. py:method:: __exit__(exc_type, exc_val, exc_tb)
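
The attributes and methods listed above suggest a filter that remembers the messages it has seen and is attached to a logger for the duration of a ``with`` block. The following is a sketch of that pattern, not the class's actual source; ``DuplicateFilterSketch`` and ``ListHandler`` are names invented for the example.

```python
import logging


class DuplicateFilterSketch:
    """Hypothetical filter: let each distinct message through only once."""

    def __init__(self, log):
        self.msgs = set()
        self.log = log

    def filter(self, record):
        msg = str(record.msg)
        is_new = msg not in self.msgs
        self.msgs.add(msg)
        return is_new

    def __enter__(self):
        self.log.addFilter(self)
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.log.removeFilter(self)


class ListHandler(logging.Handler):
    """Collect emitted messages in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


log = logging.getLogger("duplicate_filter_sketch")
log.setLevel(logging.WARNING)
log.propagate = False
handler = ListHandler()
log.addHandler(handler)

with DuplicateFilterSketch(log):
    for _ in range(3):
        log.warning("grid contains NaNs")
    log.warning("region is too small")

emitted = handler.messages
```

Here the repeated warning reaches the handler only once, followed by the second, distinct message.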


.. py:function:: _check_constraints_inside_gravity_region(constraints_df, grav_df)

   Check that all constraints are inside the region of the gravity data.


.. py:function:: _check_gravity_inside_topography_region(grav_df, topography)

   Check that all gravity data is inside the region of the topography grid.


.. py:function:: rmse(data, as_median = False)

   Calculate the root mean (or median) squared error (RMSE) of data.

   :param data: input data
   :type data: numpy.ndarray
   :param as_median: choose to give root median squared error instead, by default False
   :type as_median: bool, optional

   :returns: RMSE value
   :rtype: float
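
The two variants differ only in how the squared values are aggregated. The sketch below uses plain Python lists and the hypothetical name ``rmse_sketch``; how the real function treats NaNs or array inputs is not covered here.

```python
import math
import statistics


def rmse_sketch(data, as_median=False):
    """Root mean (or median) of the squared values."""
    squared = [value**2 for value in data]
    if as_median:
        center = statistics.median(squared)
    else:
        center = statistics.fmean(squared)
    return math.sqrt(center)


residuals = [1.0, -2.0, 2.0, 10.0]  # one large outlier

root_mean = rmse_sketch(residuals)  # sqrt(109 / 4)
root_median = rmse_sketch(residuals, as_median=True)  # sqrt(4)
```

With one large outlier in the input, the median variant is much less affected than the mean variant.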


.. py:function:: nearest_grid_fill(grid, method = 'verde', crs = None)

   fill missing values in a grid with the nearest value.

   :param grid: grid with missing values
   :type grid: xarray.DataArray
   :param method: choose method of filling, by default "verde"
   :type method: str, optional
   :param crs: if method is 'rioxarray', provide the crs of the grid, in format 'epsg:xxxx',
               by default None
   :type crs: str | None, optional

   :returns: filled grid
   :rtype: xarray.DataArray
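
Conceptually, nearest-value filling replaces each missing cell with the value of the closest cell that has data. The brute-force sketch below illustrates the idea on a nested list with NaNs; it is not how the ``verde`` or ``rioxarray`` methods are implemented, and ``nearest_fill`` is a hypothetical name.

```python
import math


def nearest_fill(grid):
    """Fill NaN cells with the value of the closest non-NaN cell."""
    known = [
        (r, c, value)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if not math.isnan(value)
    ]
    filled = [list(row) for row in grid]
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if math.isnan(value):
                # Squared index distance is enough to rank candidates.
                nearest = min(known, key=lambda k: (k[0] - r) ** 2 + (k[1] - c) ** 2)
                filled[r][c] = nearest[2]
    return filled


nan = float("nan")
grid = [
    [1.0, nan, nan],
    [nan, nan, 9.0],
]
filled = nearest_fill(grid)
```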


.. py:function:: filter_grid(grid, filter_width = None, height_displacement = None, filt_type = 'lowpass', pad_width_factor = 3, pad_mode = 'linear_ramp', pad_constant = None, pad_end_values = None)

   Apply a spatial filter to a grid.

   :param grid: grid to filter the values of
   :type grid: xarray.DataArray
   :param filter_width: width of the filter in meters, by default None
   :type filter_width: float, optional
   :param height_displacement: height displacement for upward continuation, relative to observation height, by
                               default None
   :type height_displacement: float, optional
   :param filt_type: type of filter to use from 'lowpass', 'highpass', 'up_deriv', 'easting_deriv',
                     'northing_deriv', 'up_continue', or 'total_gradient', by default "lowpass"
   :type filt_type: str, optional
   :param pad_width_factor: factor of grid width to pad the grid by, by default 3, which equates to a pad
                            with a width of 1/3 of the grid width.
   :type pad_width_factor: int, optional
   :param pad_mode: mode of padding, can be "linear", by default "linear_ramp"
   :type pad_mode: str, optional
   :param pad_constant: constant value to use for padding, by default None
   :type pad_constant: float | None, optional
   :param pad_end_values: value to use for end of padding if pad_mode is "linear_ramp", by default None
   :type pad_end_values: float | None, optional

   :returns: a filtered grid
   :rtype: xarray.DataArray
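
To illustrate how ``pad_width_factor`` and a linear-ramp padding mode interact, here is a one-dimensional sketch. It assumes the pad length is the number of cells divided by ``pad_width_factor`` (rounded down) and that values ramp from the edge value to ``end_value``; both are assumptions about the behaviour, and ``linear_ramp_pad`` is a hypothetical name.

```python
def linear_ramp_pad(row, pad_width_factor=3, end_value=0.0):
    """Pad both ends of a 1D sequence with a linear ramp towards end_value."""
    pad = len(row) // pad_width_factor  # e.g. 1/3 of the width for a factor of 3
    left = [end_value + (row[0] - end_value) * i / pad for i in range(pad)]
    right = [row[-1] + (end_value - row[-1]) * (j + 1) / pad for j in range(pad)]
    return left + list(row) + right


row = [6.0, 6.0, 6.0, 9.0, 9.0, 9.0]
padded = linear_ramp_pad(row)  # 6 cells, factor 3 -> 2 padding cells per side
```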


.. py:function:: dist_nearest_points(targets, data, coord_names = None)

   For all grid cells, calculate the distance to the nearest target.

   :param targets: contains the coordinates of the targets
   :type targets: pandas.DataFrame
   :param data: the grid data, in either gridded or tabular form
   :type data: pandas.DataFrame | xarray.DataArray | xarray.Dataset
   :param coord_names: the names of the coordinates for both the targets and the data, by default None
   :type coord_names: tuple[str, str] | None, optional

   :returns: the distance to the nearest target for each grid cell, in the same format as the
             input for `data`.
   :rtype: typing.Any
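
The quantity being computed can be sketched with a brute-force search over coordinate pairs. A real implementation would likely use a spatial index for large grids; ``distance_to_nearest`` and the plain tuples used here are assumptions for illustration.

```python
import math


def distance_to_nearest(targets, cells):
    """For each cell (x, y), return the distance to the closest target."""
    return [
        min(math.hypot(x - tx, y - ty) for tx, ty in targets)
        for x, y in cells
    ]


targets = [(0.0, 0.0), (10.0, 0.0)]
cells = [(0.0, 0.0), (3.0, 4.0), (10.0, 5.0)]
distances = distance_to_nearest(targets, cells)
```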


.. py:function:: normalize(x, low = 0, high = 1)

   Normalize a list of numbers between provided values

   :param x: numbers to normalize
   :type x: NDArray
   :param low: lower value for normalization, by default 0
   :type low: float, optional
   :param high: higher value for normalization, by default 1
   :type high: float, optional

   :returns: a normalized list of numbers
   :rtype: NDArray
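
Normalizing between two values is commonly done with min-max scaling. The sketch below assumes that formula and operates on a plain list; ``normalize_sketch`` is a hypothetical name, and a constant input (where the maximum equals the minimum) is not handled.

```python
def normalize_sketch(x, low=0, high=1):
    """Min-max scale values so the smallest maps to low and the largest to high."""
    smallest, largest = min(x), max(x)
    return [
        low + (value - smallest) * (high - low) / (largest - smallest)
        for value in x
    ]


values = [2, 4, 6, 10]
unit_range = normalize_sketch(values)
symmetric = normalize_sketch(values, low=-1, high=1)
```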


.. py:function:: normalize_xarray(da, low = 0, high = 1)

   Normalize a grid between provided values

   :param da: grid to normalize
   :type da: xarray.DataArray
   :param low: lower value for normalization, by default 0
   :type low: float, optional
   :param high: higher value for normalization, by default 1
   :type high: float, optional

   :returns: a normalized grid
   :rtype: xarray.DataArray


.. py:function:: scale_normalized(sample, bounds)

   Rescales the sample space into the unit hypercube, bounds = [0,1]

   :param sample: sampled values
   :type sample: NDArray
   :param bounds: bounds of the sampling
   :type bounds: NDArray

   :returns: sampled values normalized from 0 to 1
   :rtype: NDArray
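
Rescaling into the unit hypercube means mapping each dimension from its own bounds onto [0, 1]. The sketch assumes ``bounds`` holds one (lower, upper) pair per dimension, which may not match the array layout the real function expects; ``scale_to_unit`` is a hypothetical name.

```python
def scale_to_unit(sample, bounds):
    """Map each coordinate from its (lower, upper) bounds onto [0, 1]."""
    return [
        [(value - lower) / (upper - lower) for value, (lower, upper) in zip(point, bounds)]
        for point in sample
    ]


bounds = [(0.0, 4.0), (100.0, 500.0)]  # one pair per dimension
sample = [[2.0, 100.0], [4.0, 300.0]]
scaled = scale_to_unit(sample, bounds)
```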


.. py:function:: normalized_mindist(points, grid, low = None, high = None, mindist = None, region = None)

   Find the minimum distance between each grid cell and the nearest point. If low and
   high are provided, normalize the min dists grid between these values. If region is
   provided, all grid cells outside region are set to a distance of 0.

   :param points: coordinates of the points
   :type points: pandas.DataFrame
   :param grid: gridded data to find min dists for each grid cell
   :type grid: xarray.DataArray
   :param low: lower value for normalization, by default None
   :type low: float | None, optional
   :param high: higher value for normalization, by default None
   :type high: float | None, optional
   :param mindist: the minimum allowed distance; all values below it are set equal to it, by default None
   :type mindist: float | None, optional
   :param region: bounding region for which all grid cells outside will be set to low, by default
                  None
   :type region: list[float] | None, optional

   :returns: grid of normalized minimum distances
   :rtype: xarray.DataArray


.. py:function:: sample_grids(df, grid, sampled_name, **kwargs)

   Sample data at every point along a line

   :param df: Dataframe containing columns 'x', 'y', or columns with names defined by kwarg
              "coord_names".
   :type df: pandas.DataFrame
   :param grid: Grid to sample, either file name or xarray.DataArray
   :type grid: str or xarray.DataArray
   :param sampled_name: Name for sampled column
   :type sampled_name: str

   :returns: Dataframe with new column (sampled_name) of sample values from (grid)
   :rtype: pandas.DataFrame


.. py:function:: extract_prism_data(prism_layer)

   Extract the grid spacing from the starting prism layer and add variables 'topo' and
   'starting_topo', which are both the starting topography elevation.
   'starting_topo' remains unchanged, while 'topo' is updated at each iteration.

   :param prism_layer: starting model prism layer
   :type prism_layer: xarray.Dataset

   :returns: * **prisms_df** (*pandas.DataFrame*) -- dataframe of prism layer
             * **prisms_ds** (*xarray.Dataset*) -- prism layer with added variables 'topo' and 'starting_topo'
             * **spacing** (*float*) -- spacing of prisms
             * **topo_grid** (*xarray.DataArray*) -- grid of starting topography


.. py:function:: get_spacing(prisms_df)

   Extract spacing of harmonica prism layer using a dataframe representation.

   :param prisms_df: dataframe of harmonica prism layer
   :type prisms_df: pandas.DataFrame

   :returns: spacing of prisms
   :rtype: float
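
For a regularly spaced layer, the spacing can be read off as the difference between neighbouring unique coordinate values. The sketch assumes a regular grid and works on a plain list of easting values; ``spacing_from_coords`` is a hypothetical name.

```python
def spacing_from_coords(coords):
    """Spacing of a regular grid from the first two unique coordinate values."""
    unique = sorted(set(coords))
    return unique[1] - unique[0]


# Easting of each prism in a 3 x 2 layer, flattened as in a dataframe column.
eastings = [0.0, 500.0, 1000.0, 0.0, 500.0, 1000.0]
spacing = spacing_from_coords(eastings)
```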


.. py:function:: sample_bounding_surfaces(prisms_df, upper_confining_layer = None, lower_confining_layer = None)

   sample upper and/or lower confining layers into prisms dataframe

   :param prisms_df: dataframe of prism properties
   :type prisms_df: pandas.DataFrame
   :param upper_confining_layer: layer which the inverted topography should always be below, by default None
   :type upper_confining_layer: xarray.DataArray | None, optional
   :param lower_confining_layer: layer which the inverted topography should always be above, by default None
   :type lower_confining_layer: xarray.DataArray | None, optional

   :returns: a dataframe with added columns 'upper_bounds' and 'lower_bounds', which are the
             sampled values of the supplied confining grids.
   :rtype: pandas.DataFrame


.. py:function:: enforce_confining_surface(prisms_df, iteration_number)

   alter the surface correction values to ensure when added to the current iteration's
   topography it doesn't intersect optional confining layers.

   :param prisms_df: prism layer dataframe with optional 'upper_bounds' or 'lower_bounds' columns,
                     and current iteration's topography.
   :type prisms_df: pandas.DataFrame
   :param iteration_number: number of the current iteration, starting at 1 not 0
   :type iteration_number: int

   :returns: a dataframe with added column 'iter_{iteration_number}_correction'
   :rtype: pandas.DataFrame
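
The constraint can be understood as clipping each correction so that the updated topography stays between the bounds. The following per-prism sketch assumes that interpretation; ``clip_correction`` is a hypothetical name and the real function operates on dataframe columns.

```python
def clip_correction(topo, correction, upper=None, lower=None):
    """Limit a correction so topo + correction stays within optional bounds."""
    if upper is not None:
        # Cannot move up by more than the gap to the upper confining layer.
        correction = min(correction, upper - topo)
    if lower is not None:
        # Cannot move down by more than the gap to the lower confining layer.
        correction = max(correction, lower - topo)
    return correction


limited_up = clip_correction(topo=-100.0, correction=30.0, upper=-80.0)
limited_down = clip_correction(topo=-100.0, correction=-50.0, lower=-120.0)
unchanged = clip_correction(topo=-100.0, correction=5.0, upper=-80.0, lower=-120.0)
```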


.. py:function:: apply_surface_correction(prisms_df, iteration_number)

   update the prisms dataframe and dataset with the surface correction. Ensure that
   the updated surface doesn't intersect the optional confining surfaces.

   :param prisms_df: dataframe of prism properties
   :type prisms_df: pandas.DataFrame
   :param iteration_number: the iteration number, starting at 1 not 0
   :type iteration_number: int

   :returns: updated prisms dataframe and correction grid
   :rtype: tuple[pandas.DataFrame, xarray.DataArray]


.. py:function:: update_prisms_ds(prisms_ds, correction_grid)

   apply the corrections grid and update the prism tops, bottoms, topo, and
   densities.

   :param prisms_ds: harmonica prism layer
   :type prisms_ds: xarray.Dataset
   :param correction_grid: grid of corrections to apply to the prism layer
   :type correction_grid: xarray.DataArray

   :returns: updated prism layer with new tops, bottoms, topo, and densities
   :rtype: xarray.Dataset
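
One way to picture the update for a single prism is sketched below. It assumes a reference level ``zref`` and a single ``density_contrast`` whose sign flips for prisms below the reference; these parameters and the name ``update_prism`` are assumptions, since the function above takes only the dataset and the correction grid.

```python
def update_prism(topo, correction, zref, density_contrast):
    """Apply a correction and recompute top, bottom, and density for one prism."""
    new_topo = topo + correction
    top = max(new_topo, zref)
    bottom = min(new_topo, zref)
    # Prisms below the reference level carry a negative density contrast.
    density = density_contrast if new_topo >= zref else -density_contrast
    return {"topo": new_topo, "top": top, "bottom": bottom, "density": density}


above = update_prism(topo=120.0, correction=-20.0, zref=0.0, density_contrast=2670.0)
below = update_prism(topo=10.0, correction=-40.0, zref=0.0, density_contrast=2670.0)
```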


.. py:function:: add_updated_prism_properties(prisms_df, prisms_ds, iteration_number)

   update the prisms dataframe with the new prism tops, bottoms, topo, and densities.

   :param prisms_df: dataframe of prism properties
   :type prisms_df: pandas.DataFrame
   :param prisms_ds: dataset of prism properties
   :type prisms_ds: xarray.Dataset
   :param iteration_number: the iteration number, starting at 1 not 0
   :type iteration_number: int

   :returns: updated prism dataframe with new tops, bottoms, topo, and densities
   :rtype: pandas.DataFrame


.. py:function:: create_topography(method, region, spacing, dampings = None, registration = 'g', upwards = None, constraints_df = None, weights = None, weights_col = None, upper_confining_layer = None, lower_confining_layer = None)

   Create a grid of topography data from either the interpolation of point data or
   creating a grid of constant value. Optionally, a subset of point data can be
   interpolated and then merged with an existing grid. For this, constraints_df must
   contain two additional columns of booleans, `inside` which is True for points inside
   the region of interest, and False otherwise, and `buffer` which is True for points
   within a buffer region around the region of interest, and False otherwise. Inside
   and buffer points are used to interpolate the data, and then the interpolated data
   (without the buffer zone) is merged with the points outside the region of interest.

   :param method: method to use, either 'flat' or 'splines'
   :type method: str
   :param region: region of the grid
   :type region: tuple[float, float, float, float]
   :param spacing: spacing of the grid
   :type spacing: float
   :param dampings: damping values to use in spline cross validation for method "splines", by default
                    None
   :type dampings: list[float] | None, optional
   :param registration: choose between gridline "g" or pixel "p" registration, by default "g"
   :type registration: str, optional
   :param upwards: constant value to use for method "flat", by default None
   :type upwards: float | None, optional
   :param constraints_df: dataframe with column 'upwards' to use for method "splines", and optionally
                          columns 'inside' and 'buffer', by default None
   :type constraints_df: pandas.DataFrame | None, optional
   :param weights: weight to use for fitting the spline. Typically, this should be 1 over the data
                   uncertainty squared, by default None
   :type weights: pandas.Series | numpy.ndarray | None, optional
   :param weights_col: instead of passing the weights, pass the name of the column containing the
                       weights, by default None
   :type weights_col: str | None, optional
   :param upper_confining_layer: layer which the inverted topography should always be below, by default None
   :type upper_confining_layer: xarray.DataArray | None, optional
   :param lower_confining_layer: layer which the inverted topography should always be above, by default None
   :type lower_confining_layer: xarray.DataArray | None, optional

   :returns: a topography grid
   :rtype: xarray.DataArray
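
The `inside` and `buffer` flags described above could be derived from a region and a buffer width along the following lines. The region ordering (xmin, xmax, ymin, ymax), the ``buffer_width`` argument, and the name ``label_point`` are assumptions for this sketch; the function itself expects the two columns to already exist in ``constraints_df``.

```python
def label_point(x, y, region, buffer_width):
    """Return the `inside` and `buffer` flags for one constraint point."""
    xmin, xmax, ymin, ymax = region  # assumed ordering
    inside = xmin <= x <= xmax and ymin <= y <= ymax
    in_padded_region = (
        xmin - buffer_width <= x <= xmax + buffer_width
        and ymin - buffer_width <= y <= ymax + buffer_width
    )
    return {"inside": inside, "buffer": in_padded_region and not inside}


region = (0.0, 1000.0, 0.0, 1000.0)
in_region = label_point(500.0, 500.0, region, buffer_width=200.0)
in_buffer = label_point(1100.0, 500.0, region, buffer_width=200.0)
far_away = label_point(5000.0, 500.0, region, buffer_width=200.0)
```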


.. py:function:: grids_to_prisms(surface, reference, density, input_coord_names = ('easting', 'northing'))

   create a Harmonica layer of prisms with assigned densities.

   :param surface: data to use for prism surface
   :type surface: xarray.DataArray
   :param reference: data or constant to use for prism reference, if value is below surface, prism
                     will be inverted
   :type reference: float | xarray.DataArray
   :param density: data or constant to use for prism densities, should be in the form of a density
                   contrast across a surface (i.e. between air and rock).
   :type density: float | int | xarray.DataArray
   :param input_coord_names: names of the coordinates in the input dataarray, by default
                             ("easting", "northing")
   :type input_coord_names: tuple[str, str], optional

   :returns: a prisms layer with assigned densities
   :rtype: xarray.Dataset


.. py:function:: best_spline_cv(coordinates, data, weights = None, **kwargs)

   Find the best damping parameter for a verde.SplineCV() fit. All kwargs are passed to
   the verde.SplineCV class.

   :param coordinates: easting and northing coordinates of the data
   :type coordinates: tuple[pandas.Series | numpy.ndarray, pandas.Series | numpy.ndarray]
   :param data: data for fitting the spline to
   :type data: pandas.Series | numpy.ndarray
   :param weights: if not None, then the weights assigned to each data point. Typically, this
                   should be 1 over the data uncertainty squared, by default None
   :type weights: pandas.Series | numpy.ndarray | None, optional

   :keyword dampings: The positive damping regularization parameter. Controls how much smoothness is
                      imposed on the estimated forces. If None, no regularization is used, by default
                      None
   :kwtype dampings: float | None
   :keyword force_coords: The easting and northing coordinates of the point forces. If None (default),
                          then will be set to the data coordinates.
   :kwtype force_coords: bool
   :keyword cv: Any scikit-learn cross-validation generator. If not given, will use the
                default set by :func:`verde.cross_val_score`.
   :kwtype cv: None | cross-validation generator
   :keyword delayed: If True, will use :func:`dask.delayed.delayed` to dispatch computations and
                     allow :mod:`dask` to execute the grid search in parallel.
   :kwtype delayed: bool
   :keyword scoring: The scoring function (or name of a function) used for cross-validation.
                     Must be known to scikit-learn. See the description of *scoring* in
                     :func:`sklearn.model_selection.cross_val_score` for details. If None,
                     will fall back to the :meth:`verde.Spline.score` method.
   :kwtype scoring: None | str | Callable

   :returns: the spline which best fits the data
   :rtype: verde.Spline
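
The suggested weighting of 1 over the squared data uncertainty can be computed as below. The uncertainty values are made up for illustration.

```python
# Hypothetical per-point uncertainties, in the same units as the data.
uncertainties = [0.5, 1.0, 2.0]

# Weight each point by 1 / sigma^2 so that precise points dominate the fit.
weights = [1 / sigma**2 for sigma in uncertainties]
```

The most certain point (0.5) gets sixteen times the weight of the least certain one (2.0).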


.. py:function:: best_equivalent_source_damping(coordinates, data, delayed = False, weights = None, **kwargs)

   Find the best damping parameter for a harmonica.EquivalentSources() fit. All kwargs
   are passed to the harmonica.EquivalentSources class.

   :param coordinates: tuple of easting, northing, and upward coordinates of the gravity data
   :type coordinates: tuple[pandas.Series | numpy.ndarray, pandas.Series | numpy.ndarray, pandas.Series | numpy.ndarray]
   :param data: the gravity data
   :type data: pandas.Series | numpy.ndarray
   :param delayed: compute the scores in parallel if True, by default False
   :type delayed: bool, optional
   :param weights: optional weight values for each gravity data point, by default None
   :type weights: numpy.ndarray | None, optional

   :keyword damping: The positive damping regularization parameter. Controls how much
                     smoothness is imposed on the estimated coefficients.
                     If None, no regularization is used.
   :kwtype damping: float | None
   :keyword points: List containing the coordinates of the equivalent point sources.
                    Coordinates are assumed to be in the following order:
                    (``easting``, ``northing``, ``upward``).
                    If None, will place one point source below each observation point at
                    a fixed relative depth below the observation point.
                    Defaults to None.
   :kwtype points: list[numpy.ndarray] | None
   :keyword depth: Parameter used to control the depth at which the point sources will be
                   located.
                   If a value is provided, each source is located beneath each data point
                   (or block-averaged location) at a depth equal to its elevation minus
                   the ``depth`` value.
                   If set to ``"default"``, the depth of the sources will be estimated as
                   4.5 times the mean distance between first neighboring sources.
                   This parameter is ignored if *points* is specified.
                   Defaults to ``"default"``.
   :kwtype depth: float or str
   :keyword block_size: Size of the blocks used on block-averaged equivalent sources.
                        If a single value is passed, the blocks will have a square shape.
                        Alternatively, the dimensions of the blocks in the South-North and
                        West-East directions can be specified by passing a tuple.
                        If None, no block-averaging is applied.
                        This parameter is ignored if *points* are specified.
                        Defaults to None.
   :kwtype block_size: float | tuple[float, float] | None
   :keyword parallel: If True any predictions and Jacobian building is carried out in
                      parallel through Numba's ``jit.prange``, reducing the computation time.
                      If False, these tasks will be run on a single CPU. Defaults to True.
   :kwtype parallel: bool
   :keyword dtype: The desired data-type for the predictions and the Jacobian matrix.
                   Defaults to ``"float64"``.
   :kwtype dtype: str

   :returns: the best fitted equivalent sources
   :rtype: harmonica.EquivalentSources
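
The ``depth`` keyword describes two pieces of arithmetic: a default depth of 4.5 times the mean distance to the nearest neighbour, and sources placed at the data elevation minus that depth. The brute-force sketch below follows that description using horizontal distances between observation points; it is an illustration of the wording above, not Harmonica's implementation, and ``default_source_depth`` is a hypothetical name.

```python
import math


def default_source_depth(points, factor=4.5):
    """factor times the mean horizontal distance to each point's nearest neighbour."""
    nearest = []
    for i, (x, y) in enumerate(points):
        nearest.append(
            min(
                math.hypot(x - other_x, y - other_y)
                for j, (other_x, other_y) in enumerate(points)
                if j != i
            )
        )
    return factor * sum(nearest) / len(nearest)


observations = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
elevations = [1000.0, 1000.0, 1000.0, 1000.0]

depth = default_source_depth(observations)
# Each source sits directly below its data point, `depth` lower.
source_upward = [elevation - depth for elevation in elevations]
```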


.. py:function:: eq_sources_score(kwargs)

   Deprecated function; use ``cross_validation.eq_sources_score`` instead.


.. py:function:: gravity_decay_buffer(buffer_perc, spacing, inner_region, top, zref, obs_height, density, amplitude = None, wavelength = None, checkerboard = False, as_density_contrast = False, plot = True, plot_profile = True, progressbar = False)

   For a given buffer zone width (as percentage of x or y range) and domain parameters,
   calculate the max percent decay of the gravity anomaly within the region of
   interest.

   :param buffer_perc: percentage of the widest dimension of inner_region to use as buffer zone
   :type buffer_perc: float
   :param spacing: spacing of the prism layer and gravity observation points
   :type spacing: float
   :param inner_region: region boundaries for the region of interest
   :type inner_region: tuple[float, float, float, float]
   :param top: height for the top of the prisms
   :type top: float
   :param zref: reference level for the prisms
   :type zref: float
   :param obs_height: gravity observation height
   :type obs_height: float
   :param density: density value for the prisms
   :type density: float
   :param amplitude: if using `checkerboard`, this is the amplitude of each undulation, by default
                     None
   :type amplitude: float | None, optional
   :param wavelength: if using `checkerboard`, this is the wavelength of each undulation, by default
                      None
   :type wavelength: float | None, optional
   :param checkerboard: use an undulating checkerboard for the topography instead of a flat surface, by
                        default False
   :type checkerboard: bool, optional
   :param as_density_contrast: discretize the topography as a density contrast, resulting in no edge effects,
                               by default False
   :type as_density_contrast: bool, optional
   :param plot: plot the results, by default True
   :type plot: bool, optional
   :param plot_profile: plot a profile across the prism layer, by default True
   :type plot_profile: bool, optional
   :param progressbar: show a progressbar for the forward gravity calculation, by default False
   :type progressbar: bool, optional

   :returns: * **max_decay** (*float*) -- the maximum percentage decay of the gravity anomaly within the region of
               interest
             * **buffer_width** (*float*) -- width of the buffer zone
             * **buffer_cells** (*int*) -- number of cells in the buffer zone
             * **grav_ds** (*xarray.Dataset*) -- dataset of the forward gravity calculations
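
The buffer-related return values follow from simple arithmetic on the inputs. The sketch assumes the buffer width is the given percentage of the widest dimension of ``inner_region``, rounded to a multiple of ``spacing``, and that the decay is the relative drop from the maximum to the minimum gravity value inside the region. The rounding rule, the decay formula, the region ordering, and both function names are assumptions, and the gravity values are made up rather than forward-modelled.

```python
def buffer_from_percentage(buffer_perc, spacing, inner_region):
    """Buffer width (rounded to the grid spacing) and number of buffer cells."""
    xmin, xmax, ymin, ymax = inner_region  # assumed ordering
    widest = max(xmax - xmin, ymax - ymin)
    raw_width = widest * buffer_perc / 100
    buffer_cells = round(raw_width / spacing)
    return buffer_cells * spacing, buffer_cells


def max_percent_decay(gravity_values):
    """Relative drop, in percent, from the largest to the smallest value."""
    largest, smallest = max(gravity_values), min(gravity_values)
    return 100 * (largest - smallest) / largest


buffer_width, buffer_cells = buffer_from_percentage(
    buffer_perc=10, spacing=1000.0, inner_region=(0.0, 40000.0, 0.0, 30000.0)
)
decay = max_percent_decay([10.0, 9.5, 8.0])  # hypothetical gravity values
```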