invert4geom.utils#

Classes#

DuplicateFilter

Filters away duplicate log messages.

Functions#

_log_level(level)

Run body with logger at a different level

environ(**env)

temporarily set/reset an environment variable

_check_constraints_inside_gravity_region(...)

check that all constraints are inside the region of the gravity data

_check_gravity_inside_topography_region(grav_df, ...)

check that all gravity data is inside the region of the topography grid

rmse(data[, as_median])

function to give the root mean/median squared error (RMSE) of data

nearest_grid_fill(grid[, method, crs])

fill missing values in a grid with the nearest value.

filter_grid(grid[, filter_width, height_displacement, ...])

Apply a spatial filter to a grid.

dist_nearest_points(targets, data[, coord_names])

for all grid cells, calculate the distance to the nearest target.

normalize(x[, low, high])

Normalize a list of numbers between provided values

normalize_xarray(da[, low, high])

Normalize a grid between provided values

scale_normalized(sample, bounds)

Rescales the sample space into the unit hypercube, bounds = [0,1]

normalized_mindist(points, grid[, low, high, mindist, ...])

Find the minimum distance between each grid cell and the nearest point.

sample_grids(df, grid, sampled_name, **kwargs)

Sample grid values at every point in a dataframe

extract_prism_data(prism_layer)

extract the grid spacing from the starting prism layer and add variables 'topo' and 'starting_topo'.

get_spacing(prisms_df)

Extract spacing of harmonica prism layer using a dataframe representation.

sample_bounding_surfaces(prisms_df[, ...])

sample upper and/or lower confining layers into prisms dataframe

enforce_confining_surface(prisms_df, iteration_number)

alter the surface correction values so that, when added to the current iteration's topography, the result doesn't intersect the optional confining layers.

apply_surface_correction(prisms_df, iteration_number)

update the prisms dataframe and dataset with the surface correction.

update_prisms_ds(prisms_ds, correction_grid)

apply the corrections grid and update the prism tops, bottoms, topo, and densities.

add_updated_prism_properties(prisms_df, prisms_ds, ...)

update the prisms dataframe with the new prism tops, bottoms, topo, and densities

create_topography(method, region, spacing[, dampings, ...])

Create a grid of topography data, either by interpolating point data or by filling the grid with a constant value.

grids_to_prisms(surface, reference, density[, ...])

create a Harmonica layer of prisms with assigned densities.

best_spline_cv(coordinates, data[, weights])

Find the best damping parameter for a verde.SplineCV() fit.

best_equivalent_source_damping(coordinates, data[, ...])

Find the best damping parameter for a harmonica.EquivalentSources() fit.

eq_sources_score(kwargs)

deprecated function, use cross_validation.eq_sources_score instead.

gravity_decay_buffer(buffer_perc, spacing, ...[, ...])

For a given buffer zone width (as percentage of x or y range) and domain parameters, calculate the max percent decay of the gravity anomaly within the region of interest.

Module Contents#

_log_level(level)[source]#

Run body with logger at a different level
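As a rough illustration of the pattern this describes, the sketch below temporarily changes a logger's level inside a with-block and restores it afterwards. It uses only the standard library; the name log_level and the logger name "demo" are placeholders chosen for this example, not the package's own code.

```python
import contextlib
import logging


@contextlib.contextmanager
def log_level(level, name="demo"):
    """Temporarily switch the named logger to `level`, then restore it."""
    logger = logging.getLogger(name)
    previous = logger.level
    logger.setLevel(level)
    try:
        yield logger
    finally:
        # Restore the old level even if the body raised an exception.
        logger.setLevel(previous)


logger = logging.getLogger("demo")
logger.setLevel(logging.INFO)

with log_level(logging.ERROR):
    level_inside = logger.level  # ERROR while the body runs

level_after = logger.level  # back to INFO afterwards
```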

environ(**env)[source]#

temporarily set/reset an environment variable
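One common way to get this behaviour is a context manager that records the old values, applies the new ones, and puts everything back on exit. The sketch below assumes that approach; temporary_environ and the variable DEMO_THREADS are illustrative names, not part of the package.

```python
import contextlib
import os


@contextlib.contextmanager
def temporary_environ(**env):
    """Set environment variables for the duration of a with-block."""
    # Remember what was there before (None means the variable was unset).
    saved = {key: os.environ.get(key) for key in env}
    os.environ.update({key: str(value) for key, value in env.items()})
    try:
        yield
    finally:
        for key, old in saved.items():
            if old is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = old


os.environ.pop("DEMO_THREADS", None)  # start from a known state

with temporary_environ(DEMO_THREADS="1"):
    inside = os.environ["DEMO_THREADS"]

after = os.environ.get("DEMO_THREADS")  # unset again after the block
```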

class DuplicateFilter(logger)[source]#

Filters away duplicate log messages. Adapted from https://stackoverflow.com/a/60462619/18686384

msgs[source]#
logger[source]#
filter(record)[source]#
__enter__()[source]#
__exit__(exc_type, exc_val, exc_tb)[source]#
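The members listed above (msgs, logger, filter, __enter__, __exit__) suggest a filter that remembers the messages it has seen and attaches itself to a logger for the length of a with-block. The following is a minimal sketch of that idea using the standard logging module. DuplicateFilterSketch and ListHandler are names made up for this example, and the real class may differ in detail.

```python
import logging


class DuplicateFilterSketch:
    """Let each distinct message through once while attached to a logger."""

    def __init__(self, logger):
        self.msgs = set()
        self.logger = logger

    def filter(self, record):
        msg = str(record.msg)
        is_new = msg not in self.msgs
        self.msgs.add(msg)
        return is_new  # a falsy return value drops the record

    def __enter__(self):
        self.logger.addFilter(self)
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.logger.removeFilter(self)


class ListHandler(logging.Handler):
    """Collect emitted messages in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger("duplicate_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

with DuplicateFilterSketch(logger):
    for _ in range(3):
        logger.info("padding the grid")
    logger.info("done")
```

After the block, handler.messages holds "padding the grid" once, followed by "done".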
_check_constraints_inside_gravity_region(constraints_df, grav_df)[source]#

check that all constraints are inside the region of the gravity data

Parameters:
Return type:

None

_check_gravity_inside_topography_region(grav_df, topography)[source]#

check that all gravity data is inside the region of the topography grid

Parameters:
Return type:

None

rmse(data, as_median=False)[source]#

function to give the root mean/median squared error (RMSE) of data

Parameters:
  • data (numpy.ndarray) – input data

  • as_median (bool, optional) – choose to give root median squared error instead, by default False

Returns:

RMSE value

Return type:

float
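To make the mean/median distinction concrete, here is a small standard-library sketch of the calculation: square the values, take the mean (or median) of the squares, then take the square root. It is an illustration only; how the package treats missing values is not covered here.

```python
import math
import statistics


def rmse_sketch(values, as_median=False):
    """Root mean (or median) of the squared values."""
    squared = [v * v for v in values]
    centre = statistics.median(squared) if as_median else statistics.fmean(squared)
    return math.sqrt(centre)


residuals = [3.0, -4.0, 0.0, 12.0]
root_mean = rmse_sketch(residuals)                    # sqrt(169 / 4) = 6.5
root_median = rmse_sketch(residuals, as_median=True)  # sqrt(12.5)
```

The single large residual (12.0) pulls the mean-based value well above the median-based one, which is the usual reason to prefer the median when outliers are present.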

nearest_grid_fill(grid, method='verde', crs=None)[source]#

fill missing values in a grid with the nearest value.

Parameters:
  • grid (xarray.DataArray) – grid with missing values

  • method (str, optional) – choose method of filling, by default “verde”

  • crs (str | None, optional) – if method is ‘rioxarray’, provide the crs of the grid, in format ‘epsg:xxxx’, by default None

Returns:

filled grid

Return type:

xarray.DataArray

filter_grid(grid, filter_width=None, height_displacement=None, filt_type='lowpass', pad_width_factor=3, pad_mode='linear_ramp', pad_constant=None, pad_end_values=None)[source]#

Apply a spatial filter to a grid.

Parameters:
  • grid (xarray.DataArray) – grid to filter the values of

  • filter_width (float, optional) – width of the filter in meters, by default None

  • height_displacement (float, optional) – height displacement for upward continuation, relative to observation height, by default None

  • filt_type (str, optional) – type of filter to use from ‘lowpass’, ‘highpass’, ‘up_deriv’, ‘easting_deriv’, ‘northing_deriv’, ‘up_continue’, or ‘total_gradient’, by default “lowpass”

  • pad_width_factor (int, optional) – factor of grid width to pad the grid by, by default 3, which equates to a pad with a width of 1/3 of the grid width.

  • pad_mode (str, optional) – mode of padding, can be “linear”, by default “linear_ramp”

  • pad_constant (float | None, optional) – constant value to use for padding, by default None

  • pad_end_values (float | None, optional) – value to use for end of padding if pad_mode is “linear_ramp”, by default None

Returns:

a filtered grid

Return type:

xarray.DataArray

dist_nearest_points(targets, data, coord_names=None)[source]#

for all grid cells, calculate the distance to the nearest target.

Parameters:
Returns:

the distance to the nearest target for each gridcell, in the same format as the input for data.

Return type:

Any
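The underlying calculation can be sketched with a brute-force search: for each cell, take the smallest Euclidean distance to any target. The example below uses plain coordinate tuples rather than the grid or dataframe inputs the function accepts, and nearest_distances is a name invented for the illustration.

```python
import math


def nearest_distances(targets, cells):
    """Distance from each cell to its closest target (brute force)."""
    return [
        min(math.hypot(cx - tx, cy - ty) for tx, ty in targets)
        for cx, cy in cells
    ]


targets = [(0.0, 0.0), (10.0, 0.0)]
cells = [(0.0, 0.0), (3.0, 4.0), (10.0, 5.0)]
distances = nearest_distances(targets, cells)
```

Brute force scales with the number of cells times the number of targets, so a spatial index is the usual choice for large grids.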

normalize(x, low=0, high=1)[source]#

Normalize a list of numbers between provided values

Parameters:
  • x (NDArray) – numbers to normalize

  • low (float, optional) – lower value for normalization, by default 0

  • high (float, optional) – higher value for normalization, by default 1

Returns:

a normalized list of numbers

Return type:

NDArray
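A minimal sketch of min-max normalization, assuming the usual formula low + (x - min) * (high - low) / (max - min). It works on a plain list, and normalize_sketch is an illustrative name.

```python
def normalize_sketch(values, low=0.0, high=1.0):
    """Linearly rescale values so the minimum maps to low and the maximum to high."""
    vmin, vmax = min(values), max(values)
    return [low + (v - vmin) * (high - low) / (vmax - vmin) for v in values]


scaled = normalize_sketch([10.0, 15.0, 20.0])
scaled_custom = normalize_sketch([10.0, 15.0, 20.0], low=-1.0, high=1.0)
```

A constant input has max equal to min and would divide by zero in this sketch, so a real implementation needs to decide how to handle that case.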

normalize_xarray(da, low=0, high=1)[source]#

Normalize a grid between provided values

Parameters:
  • da (xarray.DataArray) – grid to normalize

  • low (float, optional) – lower value for normalization, by default 0

  • high (float, optional) – higher value for normalization, by default 1

Returns:

a normalized grid

Return type:

xarray.DataArray

scale_normalized(sample, bounds)[source]#

Rescales the sample space into the unit hypercube, bounds = [0,1]

Parameters:
  • sample (NDArray) – sampled values

  • bounds (NDArray) – bounds of the sampling

Returns:

sampled values normalized from 0 to 1

Return type:

NDArray
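As an illustration, rescaling into the unit hypercube can be done per dimension with (value - lower) / (upper - lower). The sketch assumes bounds is given as one (lower, upper) pair per dimension; the layout expected by the actual function is not stated above, so treat that as an assumption.

```python
def scale_to_unit(sample, bounds):
    """Map each coordinate of each sample point from [lower, upper] to [0, 1]."""
    return [
        [(value - lo) / (hi - lo) for value, (lo, hi) in zip(point, bounds)]
        for point in sample
    ]


# Two sample points in a two-parameter space (values are made up).
sample = [[2650.0, 10.0], [2700.0, 30.0]]
bounds = [(2600.0, 2800.0), (0.0, 40.0)]  # assumed (lower, upper) per dimension
unit = scale_to_unit(sample, bounds)
```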

normalized_mindist(points, grid, low=None, high=None, mindist=None, region=None)[source]#

Find the minimum distance between each grid cell and the nearest point. If low and high are provided, normalize the min dists grid between these values. If region is provided, all grid cells outside region are set to a distance of 0.

Parameters:
  • points (pandas.DataFrame) – coordinates of the points

  • grid (xarray.DataArray) – gridded data to find min dists for each grid cell

  • low (float | None, optional) – lower value for normalization, by default None

  • high (float | None, optional) – higher value for normalization, by default None

  • mindist (float | None, optional) – the minimum allowed distance; all smaller values are set equal to it, by default None

  • region (list[float] | None, optional) – bounding region for which all grid cells outside will be set to low, by default None

Returns:

grid of normalized minimum distances

Return type:

xarray.DataArray

sample_grids(df, grid, sampled_name, **kwargs)[source]#

Sample grid values at every point in a dataframe

Parameters:
  • df (pandas.DataFrame) – Dataframe containing columns ‘x’, ‘y’, or columns with names defined by kwarg “coord_names”.

  • grid (str or xarray.DataArray) – Grid to sample, either file name or xarray.DataArray

  • sampled_name (str) – Name for sampled column

  • kwargs (Any)

Returns:

Dataframe with new column (sampled_name) of sample values from (grid)

Return type:

pandas.DataFrame

extract_prism_data(prism_layer)[source]#

extract the grid spacing from the starting prism layer and add variables ‘topo’ and ‘starting_topo’, which both hold the starting topography elevation. ‘starting_topo’ remains unchanged, while ‘topo’ is updated at each iteration.

Parameters:

prism_layer (xarray.Dataset) – starting model prism layer

Returns:

  • prisms_df (pandas.DataFrame) – dataframe of prism layer

  • prisms_ds (xarray.Dataset) – prism layer with added variables ‘topo’ and ‘starting_topo’

  • spacing (float) – spacing of prisms

  • topo_grid (xarray.DataArray) – grid of starting topography

Return type:

tuple[pandas.DataFrame, xarray.Dataset, float, xarray.DataArray]

get_spacing(prisms_df)[source]#

Extract spacing of harmonica prism layer using a dataframe representation.

Parameters:

prisms_df (pandas.DataFrame) – dataframe of harmonica prism layer

Returns:

spacing of prisms

Return type:

float
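One plausible way to recover the spacing from a flattened coordinate column is to take the step between consecutive unique values. The sketch below does that for a list of eastings; it is an assumption about the approach, not a description of the package's code.

```python
def spacing_from_coordinates(eastings):
    """Return the constant step between sorted unique coordinate values."""
    unique = sorted(set(eastings))
    steps = {later - earlier for earlier, later in zip(unique, unique[1:])}
    if len(steps) != 1:
        raise ValueError("coordinates are not evenly spaced")
    return steps.pop()


# Flattened easting column of a small prism layer, 500 m apart.
eastings = [0.0, 500.0, 1000.0, 0.0, 500.0, 1000.0]
spacing = spacing_from_coordinates(eastings)
```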

sample_bounding_surfaces(prisms_df, upper_confining_layer=None, lower_confining_layer=None)[source]#

sample upper and/or lower confining layers into prisms dataframe

Parameters:
  • prisms_df (pandas.DataFrame) – dataframe of prism properties

  • upper_confining_layer (xarray.DataArray | None, optional) – layer which the inverted topography should always be below, by default None

  • lower_confining_layer (xarray.DataArray | None, optional) – layer which the inverted topography should always be above, by default None

Returns:

a dataframe with added columns ‘upper_bounds’ and ‘lower_bounds’, which are the sampled values of the supplied confining grids.

Return type:

pandas.DataFrame

enforce_confining_surface(prisms_df, iteration_number)[source]#

alter the surface correction values so that, when added to the current iteration’s topography, the result doesn’t intersect the optional confining layers.

Parameters:
  • prisms_df (pandas.DataFrame) – prism layer dataframe with optional ‘upper_bounds’ or ‘lower_bounds’ columns, and current iteration’s topography.

  • iteration_number (int) – number of the current iteration, starting at 1 not 0

Returns:

a dataframe with added column ‘iter_{iteration_number}_correction’

Return type:

pandas.DataFrame
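The constraint described here amounts to limiting each correction so that topography plus correction stays between the lower and upper bounds. A minimal per-prism sketch, with plain dictionaries standing in for dataframe rows and clamp_correction as an illustrative name:

```python
def clamp_correction(topo, correction, upper=None, lower=None):
    """Shrink a correction so topo + correction stays within the bounds."""
    updated = topo + correction
    if upper is not None and updated > upper:
        correction = upper - topo  # stop exactly at the upper surface
    if lower is not None and updated < lower:
        correction = lower - topo  # stop exactly at the lower surface
    return correction


rows = [
    {"topo": 100.0, "correction": 30.0, "upper_bounds": 120.0, "lower_bounds": 80.0},
    {"topo": 100.0, "correction": -50.0, "upper_bounds": 120.0, "lower_bounds": 80.0},
    {"topo": 100.0, "correction": 5.0, "upper_bounds": 120.0, "lower_bounds": 80.0},
]
limited = [
    clamp_correction(r["topo"], r["correction"], r["upper_bounds"], r["lower_bounds"])
    for r in rows
]
```

The first two corrections are cut back to 20.0 and -20.0, while the third already respects both bounds and is left at 5.0.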

apply_surface_correction(prisms_df, iteration_number)[source]#

update the prisms dataframe and dataset with the surface correction. Ensure that the updated surface doesn’t intersect the optional confining surfaces.

Parameters:
  • prisms_df (pandas.DataFrame) – dataframe of prism properties

  • iteration_number (int) – the iteration number, starting at 1 not 0

Returns:

updated prisms dataframe and correction grid

Return type:

tuple[pandas.DataFrame, xarray.DataArray]

update_prisms_ds(prisms_ds, correction_grid)[source]#

apply the corrections grid and update the prism tops, bottoms, topo, and densities.

Parameters:
Returns:

updated prism layer with new tops, bottoms, topo, and densities

Return type:

xarray.Dataset

add_updated_prism_properties(prisms_df, prisms_ds, iteration_number)[source]#

update the prisms dataframe with the new prism tops, bottoms, topo, and densities.

Parameters:
  • prisms_df (pandas.DataFrame) – dataframe of prism properties

  • prisms_ds (xarray.Dataset) – dataset of prism properties

  • iteration_number (int) – the iteration number, starting at 1 not 0

Returns:

updated prism dataframe with new tops, bottoms, topo, and densities

Return type:

pandas.DataFrame

create_topography(method, region, spacing, dampings=None, registration='g', upwards=None, constraints_df=None, weights=None, weights_col=None, upper_confining_layer=None, lower_confining_layer=None)[source]#

Create a grid of topography data, either by interpolating point data or by filling the grid with a constant value. Optionally, a subset of the point data can be interpolated and then merged with an existing grid. For this, constraints_df must contain two additional boolean columns: inside, which is True for points inside the region of interest and False otherwise, and buffer, which is True for points within a buffer region around the region of interest and False otherwise. The inside and buffer points are used to interpolate the data, and the interpolated data (without the buffer zone) is then merged with the points outside the region of interest.

Parameters:
  • method (str) – method to use, either ‘flat’ or ‘splines’

  • region (tuple[float, float, float, float]) – region of the grid

  • spacing (float) – spacing of the grid

  • dampings (list[float] | None, optional) – damping values to use in spline cross validation for method “splines”, by default None

  • registration (str, optional) – choose between gridline “g” or pixel “p” registration, by default “g”

  • upwards (float | None, optional) – constant value to use for method “flat”, by default None

  • constraints_df (pandas.DataFrame | None, optional) – dataframe with column ‘upwards’ to use for method “splines”, and optionally columns ‘inside’ and ‘buffer’, by default None

  • weights (pandas.Series | numpy.ndarray | None, optional) – weight to use for fitting the spline. Typically, this should be 1 over the data uncertainty squared, by default None

  • weights_col (str | None, optional) – instead of passing the weights, pass the name of the column containing the weights, by default None

  • upper_confining_layer (xarray.DataArray | None, optional) – layer which the inverted topography should always be below, by default None

  • lower_confining_layer (xarray.DataArray | None, optional) – layer which the inverted topography should always be above, by default None

Returns:

a topography grid

Return type:

xarray.DataArray
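To illustrate the inside and buffer flags and the suggested weighting (1 over the squared uncertainty), the sketch below tags a few points relative to a region. Several details are assumptions made for the example: the region order (west, east, south, north), the keys easting, northing and uncert, and the reading of "buffer" as the ring around the region that excludes the region itself.

```python
def in_region(x, y, region):
    """True if (x, y) lies within (west, east, south, north), edges included."""
    west, east, south, north = region
    return west <= x <= east and south <= y <= north


def flag_constraints(points, region, buffer_width):
    """Add 'inside', 'buffer' and 'weight' entries to each point."""
    west, east, south, north = region
    padded = (west - buffer_width, east + buffer_width,
              south - buffer_width, north + buffer_width)
    flagged = []
    for point in points:
        x, y = point["easting"], point["northing"]
        inside = in_region(x, y, region)
        flagged.append({
            **point,
            "inside": inside,
            # Assumption: the buffer is the ring around the region, excluding it.
            "buffer": (not inside) and in_region(x, y, padded),
            # Weight as 1 over the squared data uncertainty.
            "weight": 1.0 / point["uncert"] ** 2,
        })
    return flagged


points = [
    {"easting": 5.0, "northing": 5.0, "uncert": 2.0},
    {"easting": 12.0, "northing": 5.0, "uncert": 1.0},
    {"easting": 30.0, "northing": 5.0, "uncert": 0.5},
]
flagged = flag_constraints(points, region=(0.0, 10.0, 0.0, 10.0), buffer_width=5.0)
```

The first point is inside, the second falls in the buffer, and the third is outside both, so only the first two would take part in the interpolation.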

grids_to_prisms(surface, reference, density, input_coord_names=('easting', 'northing'))[source]#

create a Harmonica layer of prisms with assigned densities.

Parameters:
  • surface (xarray.DataArray) – data to use for prism surface

  • reference (float | xarray.DataArray) – data or constant to use for prism reference, if value is below surface, prism will be inverted

  • density (float | int | xarray.DataArray) – data or constant to use for prism densities, should be in the form of a density contrast across a surface (i.e. between air and rock).

  • input_coord_names (tuple[str, str], optional) – names of the coordinates in the input dataarray, by default (“easting”, “northing”)

Returns:

a prisms layer with assigned densities

Return type:

xarray.Dataset
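The relationship between surface, reference and density contrast can be sketched for a single prism. This assumes a common convention: a surface above the reference gives a prism from reference up to surface with the positive contrast, and a surface below the reference gives a prism from surface up to reference with the sign of the contrast flipped. Whether the function follows exactly this convention is an assumption here, and prism_from_surface is an illustrative name.

```python
def prism_from_surface(surface, reference, density_contrast):
    """Top, bottom and signed density of one prism (assumed convention)."""
    if surface >= reference:
        return {"top": surface, "bottom": reference, "density": density_contrast}
    # Surface below the reference: swap the limits and flip the sign.
    return {"top": reference, "bottom": surface, "density": -density_contrast}


above = prism_from_surface(500.0, 0.0, 2670.0)
below = prism_from_surface(-200.0, 0.0, 2670.0)
```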

best_spline_cv(coordinates, data, weights=None, **kwargs)[source]#

Find the best damping parameter for a verde.SplineCV() fit. All kwargs are passed to the verde.SplineCV class.

Parameters:
Keyword Arguments:
  • dampings (float | None) – The positive damping regularization parameter. Controls how much smoothness is imposed on the estimated forces. If None, no regularization is used, by default None

  • force_coords (tuple of arrays | None) – The easting and northing coordinates of the point forces. If None (default), then will be set to the data coordinates.

  • cv (None | cross-validation generator) – Any scikit-learn cross-validation generator. If not given, will use the default set by verde.cross_val_score.

  • delayed (bool) – If True, will use dask.delayed.delayed to dispatch computations and allow dask to execute the grid search in parallel.

  • scoring (None | str | Callable) – The scoring function (or name of a function) used for cross-validation. Must be known to scikit-learn. See the description of scoring in sklearn.model_selection.cross_val_score for details. If None, will fall back to the verde.Spline.score method.

Returns:

the spline which best fits the data

Return type:

verde.Spline

best_equivalent_source_damping(coordinates, data, delayed=False, weights=None, **kwargs)[source]#

Find the best damping parameter for a harmonica.EquivalentSources() fit. All kwargs are passed to the harmonica.EquivalentSources class.

Parameters:
Keyword Arguments:
  • damping (float | None) – The positive damping regularization parameter. Controls how much smoothness is imposed on the estimated coefficients. If None, no regularization is used.

  • points (list[numpy.ndarray] | None) – List containing the coordinates of the equivalent point sources. Coordinates are assumed to be in the following order: (easting, northing, upward). If None, will place one point source below each observation point at a fixed relative depth below the observation point. Defaults to None.

  • depth (float or str) – Parameter used to control the depth at which the point sources will be located. If a value is provided, each source is located beneath each data point (or block-averaged location) at a depth equal to its elevation minus the depth value. If set to "default", the depth of the sources will be estimated as 4.5 times the mean distance between first neighboring sources. This parameter is ignored if points is specified. Defaults to "default".

  • block_size (float | tuple[float, float] | None) – Size of the blocks used on block-averaged equivalent sources. If a single value is passed, the blocks will have a square shape. Alternatively, the dimensions of the blocks in the South-North and West-East directions can be specified by passing a tuple. If None, no block-averaging is applied. This parameter is ignored if points are specified. Defaults to None.

  • parallel (bool) – If True, any predictions and Jacobian building are carried out in parallel through Numba’s jit.prange, reducing the computation time. If False, these tasks will be run on a single CPU. Defaults to True.

  • dtype (str) – The desired data-type for the predictions and the Jacobian matrix. Defaults to "float64".

Returns:

the best fitted equivalent sources

Return type:

harmonica.EquivalentSources

eq_sources_score(kwargs)[source]#

deprecated function, use cross_validation.eq_sources_score instead.

Parameters:

kwargs (Any)

Return type:

float

gravity_decay_buffer(buffer_perc, spacing, inner_region, top, zref, obs_height, density, amplitude=None, wavelength=None, checkerboard=False, as_density_contrast=False, plot=True, plot_profile=True, progressbar=False)[source]#

For a given buffer zone width (as percentage of x or y range) and domain parameters, calculate the max percent decay of the gravity anomaly within the region of interest.

Parameters:
  • buffer_perc (float) – percentage of the widest dimension of inner_region to use as buffer zone

  • spacing (float) – spacing of the prism layer and gravity observation points

  • inner_region (tuple[float, float, float, float]) – region boundaries for the region of interest

  • top (float) – height for the top of the prisms

  • zref (float) – reference level for the prisms

  • obs_height (float) – gravity observation height

  • density (float) – density value for the prisms

  • amplitude (float | None, optional) – if using checkerboard, this is the amplitude of each undulation, by default None

  • wavelength (float | None, optional) – if using checkerboard, this is the wavelength of each undulation, by default None

  • checkerboard (bool, optional) – use an undulating checkerboard for the topography instead of a flat surface, by default False

  • as_density_contrast (bool, optional) – discretize the topography as a density contrast, resulting in no edge effects, by default False

  • plot (bool, optional) – plot the results, by default True

  • plot_profile (bool, optional) – plot a profile across the prism layer, by default True

  • progressbar (bool, optional) – show a progressbar for the forward gravity calculation, by default False

Returns:

  • max_decay (float) – the maximum percentage decay of the gravity anomaly within the region of interest

  • buffer_width (float) – width of the buffer zone

  • buffer_cells (int) – number of cells in the buffer zone

  • grav_ds (xarray.Dataset) – dataset of the forward gravity calculations

Return type:

tuple[float, float, int, xarray.Dataset]
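The arithmetic linking buffer_perc, buffer_width and buffer_cells can be sketched as below. The snapping of the buffer to a whole number of cells, the region order (west, east, south, north), and the decay formula (peak minus minimum, as a percentage of the peak) are all assumptions made for the illustration rather than confirmed details of the function.

```python
def buffer_from_percentage(buffer_perc, spacing, inner_region):
    """Buffer width and cell count from a percentage of the widest dimension."""
    west, east, south, north = inner_region
    widest = max(east - west, north - south)
    raw_width = widest * buffer_perc / 100
    # Assumption: snap the buffer to a whole number of grid cells.
    buffer_cells = round(raw_width / spacing)
    return buffer_cells * spacing, buffer_cells


def percent_decay(values):
    """Drop from the peak value to the minimum, as a percentage of the peak."""
    peak = max(values)
    return 100 * (peak - min(values)) / peak


buffer_width, buffer_cells = buffer_from_percentage(
    10, 1000.0, (0.0, 40000.0, 0.0, 30000.0)
)
decay = percent_decay([10.0, 9.5, 8.0])  # made-up gravity values
```

With a 10% buffer on a 40 km by 30 km region at 1 km spacing, the sketch gives a 4000 m buffer of 4 cells.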