invert4geom.utils

invert4geom.utils#

Classes#

DuplicateFilter

Filters away duplicate log messages.

Functions#

`_log_level`(level)	Run body with logger at a different level
`environ`(**env)	temporarily set/reset an environment variable
`_check_constraints_inside_gravity_region`(...)	check that all constraints are inside the region of the gravity data
`_check_gravity_inside_topography_region`(grav_df, ...)	check that all gravity data is inside the region of the topography grid
`rmse`(data[, as_median])	function to give the root mean/median squared error (RMSE) of data
`nearest_grid_fill`(grid[, method, crs])	fill missing values in a grid with the nearest value.
`filter_grid`(grid[, filter_width, height_displacement, ...])	Apply a spatial filter to a grid.
`dist_nearest_points`(targets, data[, coord_names])	for all grid cells calculate the distance to the nearest target.
`normalize`(x[, low, high])	Normalize a list of numbers between provided values
`normalize_xarray`(da[, low, high])	Normalize a grid between provided values
`scale_normalized`(sample, bounds)	Rescales the sample space into the unit hypercube, bounds = [0,1]
`normalized_mindist`(points, grid[, low, high, mindist, ...])	Find the minimum distance between each grid cell and the nearest point.
`sample_grids`(df, grid, sampled_name, **kwargs)	Sample data at every point along a line
`extract_prism_data`(prism_layer)	extract the grid spacing from the starting prism layer and add variables 'topo' and 'starting_topo'.
`get_spacing`(prisms_df)	Extract spacing of harmonica prism layer using a dataframe representation.
`sample_bounding_surfaces`(prisms_df[, ...])	sample upper and/or lower confining layers into prisms dataframe
`enforce_confining_surface`(prisms_df, iteration_number)	alter the surface correction values to ensure that the current iteration's topography, once corrected, doesn't intersect optional confining layers.
`apply_surface_correction`(prisms_df, iteration_number)	update the prisms dataframe and dataset with the surface correction.
`update_prisms_ds`(prisms_ds, correction_grid)	apply the corrections grid and update the prism tops, bottoms, topo, and
`add_updated_prism_properties`(prisms_df, prisms_ds, ...)	update the prisms dataframe with the new prism tops, bottoms, topo, and densities
`create_topography`(method, region, spacing[, dampings, ...])	Create a grid of topography data from either the interpolation of point data or creating a grid of constant value.
`grids_to_prisms`(surface, reference, density[, ...])	create a Harmonica layer of prisms with assigned densities.
`best_spline_cv`(coordinates, data[, weights])	Find the best damping parameter for a verde.SplineCV() fit.
`best_equivalent_source_damping`(coordinates, data[, ...])	Find the best damping parameter for a harmonica.EquivalentSources() fit.
`eq_sources_score`(kwargs)	deprecated function, use cross_validation.eq_sources_score instead.
`gravity_decay_buffer`(buffer_perc, spacing, ...[, ...])	For a given buffer zone width (as percentage of x or y range) and domain parameters, calculate the max percent decay of the gravity anomaly within the region of interest.

Module Contents#

_log_level(level)[source]#: Run body with logger at a different level

environ(**env)[source]#: temporarily set/reset an environment variable
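The behaviour described for `environ` can be sketched with a standard-library context manager. This is a minimal illustration, not the library's implementation; the name `temporary_environ` and the variable `EXAMPLE_FLAG` are hypothetical.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_environ(**env):
    """Set environment variables for the duration of a ``with`` block."""
    # Remember the previous values (None means the variable was unset).
    previous = {name: os.environ.get(name) for name in env}
    os.environ.update({name: str(value) for name, value in env.items()})
    try:
        yield
    finally:
        # Restore the old state, deleting variables that did not exist before.
        for name, old_value in previous.items():
            if old_value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = old_value


# EXAMPLE_FLAG is a made-up variable name used only for this demonstration.
with temporary_environ(EXAMPLE_FLAG="1"):
    inside = os.environ["EXAMPLE_FLAG"]
```

The `try`/`finally` ensures the original environment is restored even if the body of the `with` block raises.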

class DuplicateFilter(logger)[source]#

Filters away duplicate log messages. Adapted from https://stackoverflow.com/a/60462619/18686384

msgs[source]#

logger[source]#

filter(record)[source]#

__enter__()[source]#

__exit__(exc_type, exc_val, exc_tb)[source]#
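A class with the attributes and methods listed above could look roughly like the following sketch, which follows the general pattern of the linked Stack Overflow answer. The class name `DuplicateLogFilter`, the helper `ListHandler`, and the logger name are assumptions made for illustration; the library's own class may differ in detail.

```python
import logging


class DuplicateLogFilter:
    """Context manager that drops log records whose message was already seen."""

    def __init__(self, logger):
        self.msgs = set()
        self.logger = logger

    def filter(self, record):
        # Returning False tells the logging machinery to discard the record.
        msg = str(record.msg)
        is_new = msg not in self.msgs
        self.msgs.add(msg)
        return is_new

    def __enter__(self):
        self.logger.addFilter(self)
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.logger.removeFilter(self)


class ListHandler(logging.Handler):
    """Collect emitted messages in a list so the effect is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


demo_logger = logging.getLogger("duplicate_filter_demo")  # hypothetical name
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False
handler = ListHandler()
demo_logger.addHandler(handler)

with DuplicateLogFilter(demo_logger):
    for _ in range(3):
        demo_logger.info("grid contains NaNs")
    demo_logger.info("done")
```

After the block, `handler.messages` holds each distinct message once, and the filter has been removed from the logger again.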

_check_constraints_inside_gravity_region(constraints_df, grav_df)[source]#

check that all constraints are inside the region of the gravity data

Parameters:

constraints_df (pandas.DataFrame)
grav_df (pandas.DataFrame)

Return type:

None

_check_gravity_inside_topography_region(grav_df, topography)[source]#

check that all gravity data is inside the region of the topography grid

Parameters:

grav_df (pandas.DataFrame)
topography (xarray.DataArray)

Return type:

None

rmse(data, as_median=False)[source]#

function to give the root mean/median squared error (RMSE) of data

Parameters:

data (numpy.ndarray) – input data
as_median (bool, optional) – choose to give root median squared error instead, by default False

Returns:

RMSE value

Return type:

float
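As a dependency-free sketch of the calculation, assuming the input holds residuals and contains no missing values (the library function takes a NumPy array and may treat NaNs differently), the name `rmse_sketch` is hypothetical:

```python
import math
import statistics


def rmse_sketch(values, as_median=False):
    """Root mean (or median) squared error of a sequence of residuals."""
    squared = [v ** 2 for v in values]
    # Use the median of the squares when as_median is True, else the mean.
    centre = statistics.median(squared) if as_median else statistics.fmean(squared)
    return math.sqrt(centre)


residuals = [3.0, -4.0, 0.0, 12.0]
rms_mean = rmse_sketch(residuals)
rms_median = rmse_sketch(residuals, as_median=True)
```

The median variant is less sensitive to a single large residual such as the `12.0` above.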

nearest_grid_fill(grid, method='verde', crs=None)[source]#

fill missing values in a grid with the nearest value.

Parameters:

grid (xarray.DataArray) – grid with missing values
method (str, optional) – choose method of filling, by default “verde”
crs (str | None, optional) – if method is ‘rioxarray’, provide the crs of the grid, in format ‘epsg:xxxx’, by default None

Returns:

filled grid

Return type:

xarray.DataArray

filter_grid(grid, filter_width=None, height_displacement=None, filt_type='lowpass', pad_width_factor=3, pad_mode='linear_ramp', pad_constant=None, pad_end_values=None)[source]#

Apply a spatial filter to a grid.

Parameters:

grid (xarray.DataArray) – grid to filter the values of
filter_width (float, optional) – width of the filter in meters, by default None
height_displacement (float, optional) – height displacement for upward continuation, relative to observation height, by default None
filt_type (str, optional) – type of filter to use from ‘lowpass’, ‘highpass’, ‘up_deriv’, ‘easting_deriv’, ‘northing_deriv’, ‘up_continue’, or ‘total_gradient’, by default “lowpass”
pad_width_factor (int, optional) – factor of grid width to pad the grid by, by default 3, which equates to a pad with a width of 1/3 of the grid width.
pad_mode (str, optional) – mode of padding, can be “linear”, by default “linear_ramp”
pad_constant (float | None, optional) – constant value to use for padding, by default None
pad_end_values (float | None, optional) – value to use for end of padding if pad_mode is “linear_ramp”, by default None

Returns:

a filtered grid

Return type:

xarray.DataArray
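To make the padding parameters concrete, here is a one-dimensional sketch of linear-ramp padding. It assumes the pad length is the row length integer-divided by `pad_width_factor` and that the ramp follows the same convention as `numpy.pad` with `mode="linear_ramp"`; both are assumptions about the behaviour rather than a copy of the library code, and `pad_row_linear_ramp` is a hypothetical name.

```python
def pad_row_linear_ramp(row, pad_width_factor=3, end_value=0.0):
    """Pad a 1-D row on both sides, ramping linearly toward end_value."""
    n_pad = len(row) // pad_width_factor  # assumed: pad length = size // factor
    first, last = row[0], row[-1]
    # Left pad starts at end_value and climbs toward the first data value.
    left = [end_value + (first - end_value) * i / n_pad for i in range(n_pad)]
    # Right pad leaves the last data value and finishes exactly on end_value.
    right = [last + (end_value - last) * (i + 1) / n_pad for i in range(n_pad)]
    return left + list(row) + right


# Six samples with factor 3 gives a pad of two samples on each side.
padded = pad_row_linear_ramp([6.0] * 6, pad_width_factor=3, end_value=0.0)
```

Ramping the pad toward a fixed end value reduces the step at the grid edge, which is the usual motivation for padding before a Fourier-domain filter.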

dist_nearest_points(targets, data, coord_names=None)[source]#

for all grid cells calculate the distance to the nearest target.

Parameters:

targets (pandas.DataFrame) – contains the coordinates of the targets
data (pandas.DataFrame | xarray.DataArray | xarray.Dataset) – the grid data, in either gridded or tabular form
coord_names (tuple[str, str] | None, optional) – the names of the coordinates for both the targets and the data, by default None

Returns:

the distance to the nearest target for each gridcell, in the same format as the input for data.

Return type:

Any
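The quantity being computed can be illustrated with a brute-force sketch over plain coordinate pairs. This is only meant to show the definition; it compares every cell with every target, whereas a library would typically use a spatial index for large grids. The function name is hypothetical.

```python
import math


def nearest_target_distance(cells, targets):
    """For each (x, y) cell, return the distance to the closest (x, y) target."""
    return [
        min(math.hypot(cx - tx, cy - ty) for tx, ty in targets)
        for cx, cy in cells
    ]


cells = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0)]
targets = [(0.0, 0.0), (10.0, 4.0)]
distances = nearest_target_distance(cells, targets)
```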

normalize(x, low=0, high=1)[source]#

Normalize a list of numbers between provided values

Parameters:

x (NDArray) – numbers to normalize
low (float, optional) – lower value for normalization, by default 0
high (float, optional) – higher value for normalization, by default 1

Returns:

a normalized list of numbers

Return type:

NDArray
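A minimal sketch of min-max normalization between `low` and `high`, written with plain lists rather than arrays. The name `normalize_sketch` and the error raised for constant input are illustrative choices, not documented behaviour of the library.

```python
def normalize_sketch(values, low=0.0, high=1.0):
    """Linearly rescale values so the minimum maps to low and the maximum to high."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # A constant input has no range to stretch (illustrative handling).
        raise ValueError("cannot normalize values that are all equal")
    scale = (high - low) / (vmax - vmin)
    return [low + (v - vmin) * scale for v in values]


unit_range = normalize_sketch([10.0, 15.0, 20.0])
symmetric_range = normalize_sketch([10.0, 15.0, 20.0], low=-1.0, high=1.0)
```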

normalize_xarray(da, low=0, high=1)[source]#

Normalize a grid between provided values

Parameters:

da (xarray.DataArray) – grid to normalize
low (float, optional) – lower value for normalization, by default 0
high (float, optional) – higher value for normalization, by default 1

Returns:

a normalized grid

Return type:

xarray.DataArray

scale_normalized(sample, bounds)[source]#

Rescales the sample space into the unit hypercube, bounds = [0,1]

Parameters:

sample (NDArray) – sampled values
bounds (NDArray) – bounds of the sampling

Returns:

sampled values normalized from 0 to 1

Return type:

NDArray

normalized_mindist(points, grid, low=None, high=None, mindist=None, region=None)[source]#

Find the minimum distance between each grid cell and the nearest point. If low and high are provided, normalize the min dists grid between these values. If region is provided, all grid cells outside region are set to a distance of 0.

Parameters:

points (pandas.DataFrame) – coordinates of the points
grid (xarray.DataArray) – gridded data to find min dists for each grid cell
low (float | None, optional) – lower value for normalization, by default None
high (float | None, optional) – higher value for normalization, by default None
mindist (float | None, optional) – the minimum allowed distance; all values below it are set equal to it, by default None
region (list[float] | None, optional) – bounding region for which all grid cells outside will be set to low, by default None

Returns:

grid of normalized minimum distances

Return type:

xarray.DataArray

sample_grids(df, grid, sampled_name, **kwargs)[source]#

Sample data at every point along a line

Parameters:

df (pandas.DataFrame) – Dataframe containing columns ‘x’, ‘y’, or columns with names defined by kwarg “coord_names”.
grid (str or xarray.DataArray) – Grid to sample, either file name or xarray.DataArray
sampled_name (str) – Name for sampled column
kwargs (Any)

Returns:

Dataframe with new column (sampled_name) of sample values from (grid)

Return type:

pandas.DataFrame

extract_prism_data(prism_layer)[source]#

extract the grid spacing from the starting prism layer and add variables ‘topo’ and ‘starting_topo’, which are both the starting topography elevation. ‘starting_topo’ remains unchanged, while ‘topo’ is updated at each iteration.

Parameters:

prism_layer (xarray.Dataset) – starting model prism layer

Returns:

prisms_df (pandas.DataFrame) – dataframe of prism layer
prisms_ds (xarray.Dataset) – prism layer with added variables ‘topo’ and ‘starting_topo’
spacing (float) – spacing of prisms
topo_grid (xarray.DataArray) – grid of starting topography

Return type:

tuple[pandas.DataFrame, xarray.Dataset, float, xarray.DataArray]

get_spacing(prisms_df)[source]#

Extract spacing of harmonica prism layer using a dataframe representation.

Parameters:

prisms_df (pandas.DataFrame) – dataframe of harmonica prism layer

Returns:

spacing of prisms

Return type:

float
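One way to recover the spacing of a regular layer from a flattened coordinate column is to take the difference between its two smallest distinct values. The sketch below assumes a regular grid and uses a plain list in place of a dataframe column; `spacing_from_coords` is a hypothetical helper, not the library's implementation.

```python
def spacing_from_coords(coords):
    """Return the step between the two smallest distinct coordinate values."""
    unique = sorted(set(coords))
    if len(unique) < 2:
        raise ValueError("need at least two distinct coordinates")
    return float(unique[1] - unique[0])


# Flattened easting column of a hypothetical 3 x 2 prism layer.
easting = [0.0, 500.0, 1000.0, 0.0, 500.0, 1000.0]
spacing = spacing_from_coords(easting)
```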

sample_bounding_surfaces(prisms_df, upper_confining_layer=None, lower_confining_layer=None)[source]#

sample upper and/or lower confining layers into prisms dataframe

Parameters:

prisms_df (pandas.DataFrame) – dataframe of prism properties
upper_confining_layer (xarray.DataArray | None, optional) – layer which the inverted topography should always be below, by default None
lower_confining_layer (xarray.DataArray | None, optional) – layer which the inverted topography should always be above, by default None

Returns:

a dataframe with added columns ‘upper_bounds’ and ‘lower_bounds’, which are the sampled values of the supplied confining grids.

Return type:

pandas.DataFrame

enforce_confining_surface(prisms_df, iteration_number)[source]#

alter the surface correction values to ensure that the current iteration’s topography, once corrected, doesn’t intersect optional confining layers.

Parameters:

prisms_df (pandas.DataFrame) – prism layer dataframe with optional ‘upper_bounds’ or ‘lower_bounds’ columns, and current iteration’s topography.
iteration_number (int) – number of the current iteration, starting at 1 not 0

Returns:

a dataframe with added column ‘iter_{iteration_number}_correction’

Return type:

pandas.DataFrame
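The constraint can be sketched for a single prism as follows: if applying the correction would push the topography past a bound, the correction is shrunk to the largest step that still respects it. This is a scalar illustration under that assumption; the library operates on dataframe columns, and `clip_correction` is a hypothetical name.

```python
def clip_correction(topo, correction, upper=None, lower=None):
    """Shrink a correction so that topo + correction stays within the bounds."""
    if upper is not None and topo + correction > upper:
        correction = upper - topo  # largest allowed upward step
    if lower is not None and topo + correction < lower:
        correction = lower - topo  # largest allowed downward step
    return correction


# Topography at -100 m, confined between -120 m and -80 m.
up = clip_correction(-100.0, 30.0, upper=-80.0, lower=-120.0)
down = clip_correction(-100.0, -50.0, upper=-80.0, lower=-120.0)
unchanged = clip_correction(-100.0, 5.0, upper=-80.0, lower=-120.0)
```

Corrections that already keep the surface between the bounds pass through untouched.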

apply_surface_correction(prisms_df, iteration_number)[source]#

update the prisms dataframe and dataset with the surface correction. Ensure that the updated surface doesn’t intersect the optional confining surfaces.

Parameters:

prisms_df (pandas.DataFrame) – dataframe of prism properties
iteration_number (int) – the iteration number, starting at 1 not 0

Returns:

updated prisms dataframe and correction grid

Return type:

tuple[pandas.DataFrame, xarray.DataArray]

update_prisms_ds(prisms_ds, correction_grid)[source]#

apply the corrections grid and update the prism tops, bottoms, topo, and densities.

Parameters:

prisms_ds (xarray.Dataset) – harmonica prism layer
correction_grid (xarray.DataArray) – grid of corrections to apply to the prism layer

Returns:

updated prism layer with new tops, bottoms, topo, and densities

Return type:

xarray.Dataset

add_updated_prism_properties(prisms_df, prisms_ds, iteration_number)[source]#

update the prisms dataframe with the new prism tops, bottoms, topo, and densities.

Parameters:

prisms_df (pandas.DataFrame) – dataframe of prism properties
prisms_ds (xarray.Dataset) – dataset of prism properties
iteration_number (int) – the iteration number, starting at 1 not 0

Returns:

updated prism dataframe with new tops, bottoms, topo, and densities

Return type:

pandas.DataFrame

create_topography(method, region, spacing, dampings=None, registration='g', upwards=None, constraints_df=None, weights=None, weights_col=None, upper_confining_layer=None, lower_confining_layer=None)[source]#

Create a grid of topography data from either the interpolation of point data or creating a grid of constant value. Optionally, a subset of point data can be interpolated and then merged with an existing grid. For this, constraints_df must contain two additional columns of booleans: inside, which is True for points inside the region of interest and False otherwise, and buffer, which is True for points within a buffer region around the region of interest and False otherwise. Inside and buffer points are used to interpolate the data, and then the interpolated data (without the buffer zone) is merged with the points outside the region of interest.

Parameters:

method (str) – method to use, either ‘flat’ or ‘splines’
region (tuple[float, float, float, float]) – region of the grid
spacing (float) – spacing of the grid
dampings (list[float] | None, optional) – damping values to use in spline cross validation for method “splines”, by default None
registration (str, optional) – choose between gridline “g” or pixel “p” registration, by default “g”
upwards (float | None, optional) – constant value to use for method “flat”, by default None
constraints_df (pandas.DataFrame | None, optional) – dataframe with column ‘upwards’ to use for method “splines”, and optionally columns ‘inside’ and ‘buffer’, by default None
weights (pandas.Series | numpy.ndarray | None, optional) – weight to use for fitting the spline. Typically, this should be 1 over the data uncertainty squared, by default None
weights_col (str | None, optional) – instead of passing the weights, pass the name of the column containing the weights, by default None
upper_confining_layer (xarray.DataArray | None, optional) – layer which the inverted topography should always be below, by default None
lower_confining_layer (xarray.DataArray | None, optional) – layer which the inverted topography should always be above, by default None

Returns:

a topography grid

Return type:

xarray.DataArray
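For the “flat” method, the effect of region, spacing, and registration can be sketched with plain lists. The sketch assumes the region is ordered (west, east, south, north) and that the spacing divides the region evenly; the helper names are hypothetical and the library returns an xarray.DataArray rather than nested lists.

```python
def axis_coordinates(start, stop, spacing, registration="g"):
    """Coordinates along one axis for gridline ("g") or pixel ("p") registration."""
    n_cells = round((stop - start) / spacing)
    if registration == "g":
        # Gridline: nodes sit on the region edges, giving n_cells + 1 nodes.
        return [start + i * spacing for i in range(n_cells + 1)]
    if registration == "p":
        # Pixel: nodes sit at cell centers, half a spacing inside the edges.
        return [start + (i + 0.5) * spacing for i in range(n_cells)]
    raise ValueError("registration must be 'g' or 'p'")


def flat_topography(region, spacing, upwards, registration="g"):
    """Constant-valued grid returned as (easting, northing, rows of values)."""
    west, east, south, north = region  # assumed order: (W, E, S, N)
    easting = axis_coordinates(west, east, spacing, registration)
    northing = axis_coordinates(south, north, spacing, registration)
    values = [[upwards for _ in easting] for _ in northing]
    return easting, northing, values


easting, northing, values = flat_topography((0.0, 1000.0, 0.0, 500.0), 500.0, -200.0)
pixel_easting = axis_coordinates(0.0, 1000.0, 500.0, registration="p")
```

Gridline registration yields one more node per axis than pixel registration for the same region and spacing.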

grids_to_prisms(surface, reference, density, input_coord_names=('easting', 'northing'))[source]#

create a Harmonica layer of prisms with assigned densities.

Parameters:

surface (xarray.DataArray) – data to use for prism surface
reference (float | xarray.DataArray) – data or constant to use for prism reference, if value is below surface, prism will be inverted
density (float | int | xarray.DataArray) – data or constant to use for prism densities, should be in the form of a density contrast across a surface (i.e. between air and rock).
input_coord_names (tuple[str, str], optional) – names of the coordinates in the input dataarray, by default (“easting”, “northing”)

Returns:

a prisms layer with assigned densities

Return type:

xarray.Dataset

best_spline_cv(coordinates, data, weights=None, **kwargs)[source]#

Find the best damping parameter for a verde.SplineCV() fit. All kwargs are passed to the verde.SplineCV class.

Parameters:

coordinates (tuple[pandas.Series | numpy.ndarray, pandas.Series | numpy.ndarray]) – easting and northing coordinates of the data
data (pandas.Series | numpy.ndarray) – data for fitting the spline to
weights (pandas.Series | numpy.ndarray | None, optional) – if not None, then the weights assigned to each data point. Typically, this should be 1 over the data uncertainty squared, by default None
kwargs (Any)

Keyword Arguments:

dampings (float | None) – The positive damping regularization parameter. Controls how much smoothness is imposed on the estimated forces. If None, no regularization is used, by default None
force_coords (bool) – The easting and northing coordinates of the point forces. If None (default), then will be set to the data coordinates.
cv (None | cross-validation generator) – Any scikit-learn cross-validation generator. If not given, will use the default set by verde.cross_val_score.
delayed (bool) – If True, will use dask.delayed.delayed to dispatch computations and allow dask to execute the grid search in parallel.
scoring (None | str | Callable) – The scoring function (or name of a function) used for cross-validation. Must be known to scikit-learn. See the description of scoring in sklearn.model_selection.cross_val_score for details. If None, will fall back to the verde.Spline.score method.

Returns:

the spline which best fits the data

Return type:

verde.Spline

best_equivalent_source_damping(coordinates, data, delayed=False, weights=None, **kwargs)[source]#

Find the best damping parameter for a harmonica.EquivalentSources() fit. All kwargs are passed to the harmonica.EquivalentSources class.

Parameters:

coordinates (tuple[pandas.Series | numpy.ndarray, pandas.Series | numpy.ndarray, pandas.Series | numpy.ndarray]) – tuple of easting, northing, and upward coordinates of the gravity data
data (pandas.Series | numpy.ndarray) – the gravity data
delayed (bool, optional) – compute the scores in parallel if True, by default False
weights (numpy.ndarray | None, optional) – optional weight values for each gravity data point, by default None
kwargs (Any)

Keyword Arguments:

damping (float | None) – The positive damping regularization parameter. Controls how much smoothness is imposed on the estimated coefficients. If None, no regularization is used.
points (list[numpy.ndarray] | None) – List containing the coordinates of the equivalent point sources. Coordinates are assumed to be in the following order: (easting, northing, upward). If None, will place one point source below each observation point at a fixed relative depth below the observation point. Defaults to None.
depth (float or str) – Parameter used to control the depth at which the point sources will be located. If a value is provided, each source is located beneath each data point (or block-averaged location) at a depth equal to its elevation minus the depth value. If set to "default", the depth of the sources will be estimated as 4.5 times the mean distance between first neighboring sources. This parameter is ignored if points is specified. Defaults to "default".
block_size (float | tuple[float, float] | None) – Size of the blocks used on block-averaged equivalent sources. If a single value is passed, the blocks will have a square shape. Alternatively, the dimensions of the blocks in the South-North and West-East directions can be specified by passing a tuple. If None, no block-averaging is applied. This parameter is ignored if points are specified. Defaults to None.
parallel (bool) – If True, predictions and Jacobian building are carried out in parallel through Numba’s jit.prange, reducing the computation time. If False, these tasks will be run on a single CPU. Defaults to True.
dtype (str) – The desired data-type for the predictions and the Jacobian matrix. Defaults to "float64".

Returns:

the best fitted equivalent sources

Return type:

harmonica.EquivalentSources

eq_sources_score(kwargs)[source]#

deprecated function, use cross_validation.eq_sources_score instead.

Parameters:

kwargs (Any)

Return type:

float

gravity_decay_buffer(buffer_perc, spacing, inner_region, top, zref, obs_height, density, amplitude=None, wavelength=None, checkerboard=False, as_density_contrast=False, plot=True, plot_profile=True, progressbar=False)[source]#

For a given buffer zone width (as percentage of x or y range) and domain parameters, calculate the max percent decay of the gravity anomaly within the region of interest.

Parameters:

buffer_perc (float) – percentage of the widest dimension of inner_region to use as buffer zone
spacing (float) – spacing of the prism layer and gravity observation points
inner_region (tuple[float, float, float, float]) – region boundaries for the region of interest
top (float) – height for the top of the prisms
zref (float) – reference level for the prisms
obs_height (float) – gravity observation height
density (float) – density value for the prisms
amplitude (float | None, optional) – if using checkerboard, this is the amplitude of each undulation, by default None
wavelength (float | None, optional) – if using checkerboard, this is the wavelength of each undulation, by default None
checkerboard (bool, optional) – use an undulating checkerboard for the topography instead of a flat surface, by default False
as_density_contrast (bool, optional) – discretize the topography as a density contrast, resulting in no edge effects, by default False
plot (bool, optional) – plot the results, by default True
plot_profile (bool, optional) – plot a profile across the prism layer, by default True
progressbar (bool, optional) – show a progressbar for the forward gravity calculation, by default False

Returns:

max_decay (float) – the maximum percentage decay of the gravity anomaly within the region of interest
buffer_width (float) – width of the buffer zone
buffer_cells (int) – number of cells in the buffer zone
grav_ds (xarray.Dataset) – dataset of the forward gravity calculations

Return type:

tuple[float, float, int, xarray.Dataset]
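The relationship between `buffer_perc`, `spacing`, and the returned `buffer_width` and `buffer_cells` can be sketched as below. It assumes the region is ordered (west, east, south, north) and that the buffer is rounded to a whole number of cells; the rounding rule and the helper name `buffer_geometry` are assumptions, and the gravity forward calculation itself is not reproduced here.

```python
def buffer_geometry(buffer_perc, spacing, inner_region):
    """Return (buffer_width, buffer_cells, outer_region) for a padded domain."""
    west, east, south, north = inner_region  # assumed order: (W, E, S, N)
    widest = max(east - west, north - south)
    # Assumed: round to whole cells so the outer region stays on the grid.
    buffer_cells = round(widest * buffer_perc / 100 / spacing)
    buffer_width = buffer_cells * spacing
    outer_region = (
        west - buffer_width,
        east + buffer_width,
        south - buffer_width,
        north + buffer_width,
    )
    return buffer_width, buffer_cells, outer_region


# A 40 km x 30 km region, 1 km spacing, 10 % buffer.
width, cells, outer = buffer_geometry(10, 1000.0, (0.0, 40000.0, 0.0, 30000.0))
```

Here the widest dimension is 40 km, so a 10 % buffer is 4 km, or four cells, on every side.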